[Trilinos-Users] Tpetra+Anasazi performance

Heroux, Michael A maherou at sandia.gov
Wed May 14 17:10:32 MDT 2014


Kokkos, the manycore/accelerator package under Tpetra, has gone through significant refactoring and now scales very well, but it is not available under Tpetra in the current release.  It will be in the next release.


From: David Hysom <hysom1 at llnl.gov>
Date: Wednesday, May 14, 2014 4:32 PM
To: Karen Devine <kddevin at sandia.gov>, "trilinos-users at software.sandia.gov" <trilinos-users at software.sandia.gov>
Subject: Re: [Trilinos-Users] [EXTERNAL] Tpetra+Anasazi performance

Since we build our matrices on a single processor, I can easily
code up a 7-point 3D stencil matrix.
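
For readers following along, here is a minimal sketch of such a matrix,
assuming the stencil meant above is the standard 7-point finite-difference
Laplacian on a 3D grid (6 on the diagonal, -1 for each of up to six
neighbours). It is not taken from Galeri or from the poster's code; the
function name laplace3d_csr and the lexicographic row numbering are
illustrative choices. It only builds plain CSR arrays on one process, which
would still have to be handed to whatever matrix class is in use.

```python
def laplace3d_csr(nx, ny, nz):
    """Build CSR arrays for a 7-point Laplacian on an nx x ny x nz grid.

    Hypothetical helper for illustration only. Grid point (i, j, k) maps to
    row i + nx * (j + ny * k). Neighbours outside the grid are dropped,
    which corresponds to homogeneous Dirichlet boundaries.
    """
    def index(i, j, k):
        return i + nx * (j + ny * k)

    row_ptr = [0]   # row_ptr[r]..row_ptr[r+1] delimits row r
    col_idx = []    # column index of each stored entry
    values = []     # value of each stored entry

    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                # Entries are appended in ascending column order.
                if k > 0:
                    col_idx.append(index(i, j, k - 1)); values.append(-1.0)
                if j > 0:
                    col_idx.append(index(i, j - 1, k)); values.append(-1.0)
                if i > 0:
                    col_idx.append(index(i - 1, j, k)); values.append(-1.0)
                col_idx.append(index(i, j, k)); values.append(6.0)
                if i < nx - 1:
                    col_idx.append(index(i + 1, j, k)); values.append(-1.0)
                if j < ny - 1:
                    col_idx.append(index(i, j + 1, k)); values.append(-1.0)
                if k < nz - 1:
                    col_idx.append(index(i, j, k + 1)); values.append(-1.0)
                row_ptr.append(len(col_idx))

    return row_ptr, col_idx, values
```

Unlike the scale-free and random graphs, every interior row here has exactly
seven entries, so the per-row work is uniform across the matrix.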

The graphs we've been using are either scale-free (Amazon co-purchase)
or random (synthetically generated on the fly).

On 05/14/2014 12:19 PM, Devine, Karen D wrote:
I am curious to know whether the scaling behavior is similar for a more regular matrix
(e.g., something like a Laplace3D matrix from Galeri).

Maybe we could scrape up some code from Galeri or Zoltan2 that we could try.
For example, Trilinos/packages/zoltan2/test/helpers/UserInputForTests.hpp has a method, buildCrsMatrix, that generates Laplace3D matrices from Galeri.


On 5/14/14 11:36 AM, "David Hysom" <hysom1 at llnl.gov> wrote:


Please see the attachment for an example of what we're seeing for
strong scaling with LOBPCG+Tpetra in a shared-memory environment.
The bottom line is that this example shows a speedup of 3.0.
We've run many problems with varying parameters and matrices, and typically
see speedups of only 2.0 to 3.0.
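
To put a strong-scaling speedup like this in context, a short sketch of the
usual bookkeeping may help. It computes parallel efficiency and the
experimentally determined serial fraction (the Karp-Flatt metric), which is
the serial fraction that Amdahl's law would need to explain the measurement.
The message does not state how many threads were used, so the thread count
in the usage lines is purely a hypothetical value, and the function names
are illustrative.

```python
def parallel_efficiency(speedup, num_threads):
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / num_threads


def karp_flatt_serial_fraction(speedup, num_threads):
    """Serial fraction implied by a measured speedup (Karp-Flatt metric)."""
    if num_threads < 2:
        raise ValueError("need at least 2 threads to estimate a serial fraction")
    return (1.0 / speedup - 1.0 / num_threads) / (1.0 - 1.0 / num_threads)


def amdahl_speedup(serial_fraction, num_threads):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / num_threads)


# Usage with the reported speedup of 3.0.
# ASSUMPTION: 8 threads is a made-up value; the thread count is not given.
measured_speedup = 3.0
assumed_threads = 8
efficiency = parallel_efficiency(measured_speedup, assumed_threads)
serial = karp_flatt_serial_fraction(measured_speedup, assumed_threads)
print(f"efficiency = {efficiency:.3f}, implied serial fraction = {serial:.3f}")
```

If the implied serial fraction stays roughly constant as threads are added,
the limit is serial work; if it grows, overheads such as memory bandwidth or
synchronization are the more likely cause.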

Is this expected? Is there anything in upcoming Trilinos development
that might increase scalability?

Stats are for trilinos-11.6.1

A second question: we've tested with OpenMP, Pthreads, and TBB.
We always find that OpenMP gives the best results (shortest execution
time), although Pthreads and TBB are reasonably close. Do you know
of circumstances (not limited to Anasazi) where Pthreads or TBB
outperform OpenMP?

thanks, David

