[Trilinos-Users] Results from a scaling study of ML
John Cary
cary at colorado.edu
Mon Mar 29 05:09:43 MST 2021
Thanks, James. So I did
srun -n 32 --distribution=block,block -c 2
/global/cscratch1/sd/cary/builds-cori-gcc/vsimall-cori-gcc/trilinos-13.0.0/parcomm/packages/ml/examples/BasicExamples/ML_preconditioner.exe
but I am still seeing the same single-node behavior: parallel efficiency
dropping to 25%.
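Since the study is weak scaling (fixed work per process), "parallel efficiency" reduces to t(1 process)/t(N processes): ideal weak scaling keeps the runtime flat as processes are added. A sketch with made-up timings (assumed, not numbers from the study) showing how a 25% figure arises:

```shell
# Illustrative arithmetic only -- these timings are assumed, not measured.
t1=10.0    # solve time on 1 process, seconds
t32=40.0   # solve time on 32 processes, same work per process
awk -v t1="$t1" -v tn="$t32" \
    'BEGIN { printf "weak-scaling efficiency = %.0f%%\n", 100 * t1 / tn }'
```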
I can see that it is not the fault of ML: on my own local cluster, which has
two AMD EPYC 7302 16-Core Processors per node, the single-node parallel
efficiency at 32 processes is 82%.
So I guess I still do not know how best to launch on cori.
Thx.....John
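For reference, a fully spelled-out launch line in the spirit of James's advice below might look like the following sketch; the -N/-n values and the shortened executable path are illustrative assumptions, and the command is only echoed so the pieces stay visible:

```shell
# Sketch only: assemble and print an srun line for one Cori/Haswell node.
# The -N/-n values and the executable path are illustrative assumptions.
cores_per_proc=1
cpus_per_task=$((cores_per_proc * 2))   # Haswell: 2 hyperthreads per physical core
echo srun -N 1 -n 32 --cpu-bind=cores -c "$cpus_per_task" \
    --distribution=block,block ./ML_preconditioner.exe
```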
On 3/28/21 6:18 PM, James Elliott wrote:
> # cores per proc is usually between 1 and 16 (fill up one socket)
>
> I may be off... been a while since I ran there. FYI, cori was really
> noisy.
>
> John, I believe the usual Cori/Haswell slurm launch should look like:
>
> cores_per_proc=1
> srun_opts=(
> # use cores,v if you want verbosity
> --cpu_bind=cores
> -c $(($cores_per_proc*2))
> # distribution puts ranks on nodes, then sockets
> # block,block - is like aprun default, which fills
> # a socket on a node, then the next socket on the same node
> # then the next node...
> # block,cyclic is/was the default on Cori
> # that will put rank0 on socket0, rank1 on socket1 (same node)
> # and repeat until the node is full. (it will stride your procs
> # between the sockets on the node)
> # This detail caused a few apps pain when Trinity swapped from
> # aprun.
> # Pick block,block or block,cyclic
> --distribution=block,block
> # the usual -n -N stuff
> )
>
> srun "${srun_opts[@]}" ./app ....
>
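To make the block,block vs block,cyclic distinction above concrete, here is a small sketch (my own illustration, not Slurm code) printing which socket each of the first four ranks occupies on a hypothetical 2-socket, 4-cores-per-socket node:

```shell
# Illustration only: the second distribution level picks the socket within a node.
sockets=2
cores_per_socket=4
per_node=$((sockets * cores_per_socket))
for rank in 0 1 2 3; do
  slot=$((rank % per_node))
  block_socket=$((slot / cores_per_socket))   # block,block: fill socket 0 first
  cyclic_socket=$((slot % sockets))           # block,cyclic: stride across sockets
  echo "rank $rank -> block,block: socket $block_socket  block,cyclic: socket $cyclic_socket"
done
```

Under block,block the first four ranks all sit on socket 0; under block,cyclic they alternate 0,1,0,1, which is the striding between sockets James describes.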
> On 3/28/2021 5:23 PM, John Cary wrote:
>> Hi All,
>>
>> As promised, we have done scaling studies on the haswell nodes on
>> Cori at NERSC using ML_preconditioner.exe
>> as compiled, so this is a weak scaling study with 65536 cells/nodes
>> per processor. We find a parallel efficiency
>> (speedup/expected speedup) that drops to 25% on 32 processes.
>>
>> Is this expected?
>>
>> Are there command-line args to srun that might improve this? (I
>> tried various args to --cpu-bind.)
>>
>> I can provide plenty more info (configuration line, how run, ...).
>>
>> Thx.....John
>>
>> On 3/24/21 9:05 AM, John Cary wrote:
>>>
>>>
>>> Thanks, Chris, thanks Jonathan,
>>>
>>> I have found these executables, and we are doing scaling studies now.
>>>
>>> Will report....John
>>>
>>>
>>>
>>> On 3/23/21 9:42 PM, Siefert, Christopher wrote:
>>>> John,
>>>>
>>>> There are some scaling examples in
>>>> trilinoscouplings/examples/scaling (example_Poisson.cpp and
>>>> example_Poisson2D.cpp) that use the old stack and might do what you
>>>> need.
>>>>
>>>> -Chris
>>>
>>>
>>> On 3/23/21 7:48 PM, Hu, Jonathan wrote:
>>>> Hi John,
>>>>
>>>> ML has a 2D Poisson driver in
>>>> ml/examples/BasicExamples/ml_preconditioner.cpp. The cmake target
>>>> should be either "ML_preconditioner" or "ML_preconditioner.exe".
>>>> There's a really similar one in ml/examples/XML/ml_XML.cpp that you
>>>> can drive with an XML deck. Is this what you're after?
>>>>
>>>> Jonathan
>>>>
>>>> On 3/23/21, 5:47 PM, "Trilinos-Users on behalf of John Cary"
>>>> <trilinos-users-bounces at trilinos.org on behalf of
>>>> cary at colorado.edu> wrote:
>>>>
>>>> We are still using the old stack: ML, Epetra, ...
>>>>
>>>> When we run a simple Poisson solve on our cluster (32 cores/node),
>>>> we see parallel efficiency drop to 4% on one node with 32 cores.
>>>> So we naturally believe we are doing something wrong.
>>>>
>>>> Does trilinos come with a simple Poisson-solve executable that we
>>>> could use to test scaling (to get around the uncertainties of our
>>>> use of trilinos)?
>>>>
>>>> Thx.......John Cary
>>>>
>>>> _______________________________________________
>>>> Trilinos-Users mailing list
>>>> Trilinos-Users at trilinos.org
>>>> http://trilinos.org/mailman/listinfo/trilinos-users_trilinos.org
>>>>
>>>>
>>>>
>>>
>>
>>
>