[Trilinos-Users] Tpetra+Anasazi performance

Hoemmen, Mark mhoemme at sandia.gov
Thu May 15 13:04:18 MDT 2014


On 5/15/14, 12:00 PM, "trilinos-users-request at software.sandia.gov"
<trilinos-users-request at software.sandia.gov> wrote:
>>David Hysom wrote:
>>> Hi,
>>>
>>> Please see the attached for an example of what we're seeing for
>>> strong scaling for LOBPCG+Tpetra, in a shared memory environment.
>>> Bottom line is, this example shows a speedup of 3.0.
>>> We've run many problems with varying parameters/matrices, and typically
>>> only see speedups between 2.0 and 3.0.
>>>
>>> Is this expected? Is there anything in upcoming Trilinos development
>>> that might increase scalability?
>>>
>>> Stats are for trilinos-11.6.1
>>>
>>> A second question: we've tested with OpenMP, Pthreads, and TBB.
>>> We always find that OpenMP gives the best results (shortest execution
>>> time), although Pthreads and TBB are reasonably close. Do you know
>>> of circumstances (not limited to Anasazi) where Pthreads or TBB
>>> outperform OpenMP?

Would you mind sharing your use case (matrix, right-hand side, and Anasazi
settings) with me?  We're actively working on Tpetra to improve
performance with respect to threads.  Sparse mat-vec and vector operations
should be reasonably fast with threads, and generally do the right thing
with respect to NUMA.  I would recommend using the Pthreads or OpenMP
back-ends for now. 

As Mike Heroux mentioned, we are currently working on thread performance
improvements in Tpetra.  I would recommend using the latest release of
Trilinos, or even the current development version (available via our
public git repository), for the best thread performance.

Thanks!
mfh


