[Trilinos-Users] Results from a scaling study of ML

John Cary cary at colorado.edu
Mon Mar 29 05:09:43 MST 2021


Thanks, James.  So I did

srun -n 32 --distribution=block,block -c 2 \
  /global/cscratch1/sd/cary/builds-cori-gcc/vsimall-cori-gcc/trilinos-13.0.0/parcomm/packages/ml/examples/BasicExamples/ML_preconditioner.exe

but I am still seeing the same single-node behavior: parallel
efficiency drops to 25% at 32 processes.
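
Spelled out against James's template below, this is how I read the
suggested single-node launch. It is a sketch only, assuming
cores_per_proc=1 (one MPI rank per physical core, so -c 2 covers both
hyperthreads of a Haswell core), and the long path to
ML_preconditioner.exe is abbreviated. One thing I notice is that my
line above does not pass --cpu_bind=cores:

cores_per_proc=1
srun_opts=(
    # bind each rank to a physical core (missing from my line above)
    --cpu_bind=cores
    # two hardware threads per Haswell core
    -c $(($cores_per_proc*2))
    # fill socket 0, then socket 1 on the node
    --distribution=block,block
    # one node, 32 ranks = 32 physical cores
    -N 1
    -n 32
)
srun "${srun_opts[@]}" ./ML_preconditioner.exe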

I can see that it is not the fault of ML, because on my own local
cluster, which has two AMD EPYC 7302 16-core processors per node, the
single-node parallel efficiency at 32 processes is 82%.
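
For concreteness, the arithmetic behind those percentages, assuming
that for these weak-scaling runs parallel efficiency means the
1-process wall time divided by the N-process wall time (the work per
process is fixed):

    efficiency = t_1 / t_N
    EPYC node:  0.82  =>  t_32 is roughly 1.2 x t_1
    Cori node:  0.25  =>  t_32 is 4 x t_1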

So I guess I still do not know how best to launch on Cori.

Thx.....John


On 3/28/21 6:18 PM, James Elliott wrote:
> # cores per proc is usually between 1 and 16 (fill up one socket)
> cores_per_proc=1
>
> I may be off... been a while since I ran there. FYI, Cori was really
> noisy.
>
> John, I believe the usual Cori/Haswell slurm launch should look like:
>
> srun_opts=(
> # use cores,v if you want verbosity
> --cpu_bind=cores
> -c $(($cores_per_proc*2))
> # distribution puts ranks on nodes, then sockets
> # block,block - is like the aprun default, which fills
> # a socket on a node, then the next socket on the same node,
> # then the next node...
> # block,cyclic is/was the default on Cori
> # that will put rank0 on socket0, rank1 on socket1 (same node)
> # and repeat until the node is full. (it will stride your procs
> # between the sockets on the node)
> # This detail caused a few apps pain when Trinity swapped from
> # aprun.
> # Pick block,block or block,cyclic
> --distribution=block,block
> # the usual -n -N stuff
> )
>
> srun "${srun_opts[@]}" ./app ....
>
> On 3/28/2021 5:23 PM, John Cary wrote:
>> Hi All,
>>
>> As promised, we have done scaling studies on the Haswell nodes on 
>> Cori at NERSC using ML_preconditioner.exe
>> as compiled, so this is a weak scaling study with 65536 cells/nodes 
>> per processor.  We find a parallel efficiency
>> (speedup/expected speedup) that drops to 25% on 32 processes.
>>
>> Is this expected?
>>
>> Are there command-line args to srun that might improve this?  (I 
>> tried various args to --cpu-bind.)
>>
>> I can provide plenty more info (configuration line, how run, ...).
>>
>> Thx.....John
>>
>> On 3/24/21 9:05 AM, John Cary wrote:
>>>
>>>
>>> Thanks, Chris, thanks Jonathan,
>>>
>>> I have found these executables, and we are doing scaling studies now.
>>>
>>> Will report....John
>>>
>>>
>>>
>>> On 3/23/21 9:42 PM, Siefert, Christopher wrote:
>>>> John,
>>>>
>>>> There are some scaling examples in 
>>>> trilinoscouplings/examples/scaling (example_Poisson.cpp and 
>>>> example_Poisson2D.cpp) that use the old stack and might do what you 
>>>> need.
>>>>
>>>> -Chris
>>>
>>>
>>> On 3/23/21 7:48 PM, Hu, Jonathan wrote:
>>>> Hi John,
>>>>
>>>>     ML has a 2D Poisson driver in 
>>>> ml/examples/BasicExamples/ml_preconditioner.cpp.  The cmake target 
>>>> should be either "ML_preconditioner" or "ML_preconditioner.exe". 
>>>> There's a really similar one in ml/examples/XML/ml_XML.cpp that you 
>>>> can drive with an XML deck. Is this what you're after?
>>>>
>>>> Jonathan
>>>>
>>>> On 3/23/21, 5:47 PM, "Trilinos-Users on behalf of John Cary" 
>>>> <trilinos-users-bounces at trilinos.org on behalf of 
>>>> cary at colorado.edu> wrote:
>>>>
>>>>      We are still using the old stack: ML, Epetra, ...
>>>>
>>>>      When we run a simple Poisson solve on our cluster (32 
>>>> cores/node), we
>>>>      see parallel efficiency drop to 4% on one node with 32 cores.  
>>>> So we
>>>>      naturally believe we are doing something wrong.
>>>>
>>>>      Does Trilinos come with a simple Poisson-solve executable that 
>>>> we could
>>>>      use to test scaling (to get around the uncertainties of our 
>>>> use of
>>>> Trilinos)?
>>>>
>>>>      Thx.......John Cary
>>>>
>>>
>>
>>
>



