[Trilinos-Users] Cache performance of Trilinos MatVec

Heroux, Michael A maherou at sandia.gov
Thu Aug 6 01:37:35 MDT 2009


James,

I have attached a simple driver program that I derived from Trilinos/packages/epetra/test/BasicPerfTest/cxx_main.cpp.  It calls the OSKI kernels in addition to Epetra_CrsMatrix and Epetra_JadMatrix.  I have included a Makefile that needs installed versions of both Trilinos and OSKI.  I have also included my configure script.  Note that I have turned off the Fortran kernels, which significantly affects sparse matrix times multivector performance.

Everything seems to work just fine, although I am not getting OSKI to perform any aggressive optimization right now, and am not sure why.  More to do.

Let me know if this helps.

Mike


On 8/4/09 11:12 AM, "James C. Sutherland" <James.Sutherland at utah.edu> wrote:

Mike,

Thanks.  FYI, the dimensionality of the system is hard-coded in the OSKI example so that you have to recompile with the dimensions consistent with the matrix being used.  Perhaps this could be fixed in a future release since this information is available in the MatrixMarket file...

Does this example function for you with your copy of OSKI?  I am seeing errors from OSKI:

***********************************************************
* OSKI Error -9 : Feature not yet implemented
* Occurred at/near '../../src/matcreate.c', line 290.

Additional information:
  Can't find CSR wrapper, liboski_Tid_LTX_oski_WrapCSR()
**********************************************************

***********************************************************
* OSKI Error -15 : Invalid matrix handle
* Occurred at/near '../../src/matmult.c', line 32.

Additional information:
  Please check matrix object, parameter #1 in call to oski_MatMult_Tid()
**********************************************************
OskiVector multiply error

I am wondering if I have a problem with my OSKI installation, or if there is a problem with the Trilinos interface.

James


On Aug 4, 2009, at 9:21 AM, Heroux, Michael A wrote:

James,

 A.dat is not any particular file.  The driver is looking for a Matrix Market format file.  If you don't give it the path/filename of such a file, it assumes that you copied one into the file named ./A.dat for convenience.

 In case you are not familiar with it, Matrix Market format is a simple portable coordinate format for data exchange (google Matrix Market NIST).
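
 To make the coordinate layout concrete, here is a minimal sketch of a reader for the simplest Matrix Market variant ("coordinate real general"): a banner line, optional comment lines starting with '%', a size line "rows cols nnz", and then one "i j value" triple per entry with 1-based indices.  The names CooEntry, CooMatrix, and readMatrixMarket are hypothetical and are not part of Trilinos or OSKI; symmetric, pattern, and complex files are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical containers for this sketch; indices are stored 0-based.
struct CooEntry {
    int row;
    int col;
    double value;
};

struct CooMatrix {
    int numRows = 0;
    int numCols = 0;
    std::vector<CooEntry> entries;
};

// Reads a "coordinate real general" Matrix Market stream.
CooMatrix readMatrixMarket(std::istream& in) {
    std::string line;
    if (!std::getline(in, line) || line.rfind("%%MatrixMarket", 0) != 0) {
        throw std::runtime_error("missing %%MatrixMarket banner");
    }
    if (line.find("coordinate") == std::string::npos ||
        line.find("real") == std::string::npos ||
        line.find("general") == std::string::npos) {
        throw std::runtime_error("only 'coordinate real general' is handled here");
    }

    // Skip comment lines (starting with '%') and blank lines.
    bool haveSizeLine = false;
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] != '%') {
            haveSizeLine = true;
            break;
        }
    }
    if (!haveSizeLine) {
        throw std::runtime_error("missing size line");
    }

    // The size line carries the dimensions, so nothing needs to be hard-coded.
    CooMatrix A;
    std::size_t nnz = 0;
    std::istringstream sizeStream(line);
    if (!(sizeStream >> A.numRows >> A.numCols >> nnz)) {
        throw std::runtime_error("malformed size line");
    }

    A.entries.reserve(nnz);
    for (std::size_t k = 0; k < nnz; ++k) {
        int i = 0;
        int j = 0;
        double v = 0.0;
        if (!(in >> i >> j >> v)) {
            throw std::runtime_error("truncated entry list");
        }
        if (i < 1 || i > A.numRows || j < 1 || j > A.numCols) {
            throw std::runtime_error("entry index out of range");
        }
        A.entries.push_back({i - 1, j - 1, v});  // convert to 0-based
    }
    assert(A.entries.size() == nnz);
    return A;
}
```

 Passing a std::ifstream opened on the file (for example ./A.dat) to readMatrixMarket is the intended usage.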

 Just so you have something to start with I have attached one sample file.  There are many more at NIST's website.

 Mike


 On 8/4/09 8:59 AM, "James C. Sutherland" <James.Sutherland at utah.edu> wrote:


Mike,

 It turns out that the example file (epetra/example/OSKI/cxx_main.cpp) wants to read a matrix file from disk, and that file isn't distributed with Trilinos.  If you have a copy of that "A.dat" file lying around, could you pass it along?

 James

 ---
 James C. Sutherland
 Assistant Professor, Chemical Engineering
 The University of Utah
 50 S. Central Campus Dr, 3290 MEB
 Salt Lake City, UT 84112-9203
 (801) 585-1246
 http://www.che.utah.edu/~sutherland


 On Aug 4, 2009, at 7:25 AM, James C. Sutherland wrote:


Mike,

 Here are some more details:

 1. When I build OSKI and run tests, I get 7/25 tests failing.  The most common error is

 ***********************************************************
 * OSKI Error -9 : Create-from-CSR routine not implemented.
 * Occurred at/near '../../src/xforms.c', line 432.

 Additional information:
   GCSR matrix type does not implement a method to create a GCSR matrix from CSR format.
 **********************************************************

 This same error shows up when I run my application code using Trilinos.  My guess is that I might have a configuration problem with OSKI.  I am on a Mac.  I had some trouble building OSKI and benefitted from your correspondence with Rich Vuduc to get OSKI built.  I am having trouble with configuration on my Linux box - 64/32 bit issues.


 2. I am using trilinos 9.0.2.  It appears that the OSKI example is not compiled through the trilinos build system.  I will try to compile it stand-alone and see if it runs.


 3. I tried out the alpha version of Trilinos 10 using CMake.  The OSKI-related files are not compiled into the Epetra library, and their associated headers are not installed either.  Digging through some of the CMakeLists.txt files, it looks like the support for OSKI installation did not get migrated into the CMake build system.  I wanted to make you aware of that so it doesn't slip through the cracks, particularly since the OSKI interface doesn't seem to be part of your regression test suite.


 Are you able to successfully run the OSKI example file in trilinos on Mac?

 James



 On Aug 3, 2009, at 11:12 PM, Heroux, Michael A wrote:


James,

  Please send a bit more detail, perhaps off-list, and I can look at it.  I have been working with OSKI myself lately.

  CMake support of additional features is growing but not complete.  If we don't get them all in right away, there is always the manual definition of CXXFLAGS, etc. to help us in the meantime.

  Thanks.

  Mike


  On 8/3/09 6:07 PM, "James C. Sutherland" <James.Sutherland at utah.edu> wrote:



Mike et al,

  I have tried building trilinos with OSKI.  It appears that the OSKI examples are not built (at least in 9.0.2 which I am using).  Do you have local builds or regression tests that are functional with OSKI?  Is there a way of building the examples through the trilinos build system?

  I am having runtime errors when trying to use OSKI matrices rather than CRS matrices in my application.  I am trying to discern whether this is a problem in my usage of Epetra_OskiMatrix or if there is a problem in the trilinos interface to OSKI.

  FYI, it looks like the new CMake build system is even less aware of OSKI.

  James



  On Jun 10, 2009, at 8:16 AM, Heroux, Michael A wrote:



James,

   As Alan mentioned, sparse MV is notorious for poor cache performance.  Some of the best work in addressing this issue has been done by the BeBOP project at UC-Berkeley in the OSKI library.  Epetra can use OSKI for sparse operations via Epetra_Oski* classes.  These classes rely on the OSKI library (which you download and build yourself).  The following tech report describes the interface and performance results:

   http://trilinos.sandia.gov/packages/epetra/IanKarlin.pdf

   You might also consider the Epetra_JadMatrix class, which can work well on recent microprocessors with good streaming support.  Epetra_JadMatrix can be especially useful for very sparse matrices (like yours, with 3-4 nonzeros per row).  It is just a few-line change to your code to try these options.
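
   To illustrate why a jagged-diagonal layout streams well, here is a minimal sketch of the storage idea in plain C++.  It is not the Epetra_JadMatrix implementation; JadMatrix, buildJad, and jadMatVec are hypothetical names.  Rows are sorted by decreasing nonzero count, and the d-th stored entry of every sufficiently long row is packed into one contiguous "jagged diagonal", so the inner loop runs over long unit-stride arrays instead of 3-4 entries at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical jagged-diagonal container for this sketch.
struct JadMatrix {
    std::size_t numRows = 0;
    std::vector<std::size_t> perm;   // perm[k] = original row at sorted position k
    std::vector<std::size_t> jdPtr;  // start of each jagged diagonal (numDiags + 1)
    std::vector<int> cols;           // column index of each stored entry
    std::vector<double> coefs;       // value of each stored entry
};

// Builds the jagged-diagonal layout from compressed-row (CSR) arrays.
JadMatrix buildJad(const std::vector<std::size_t>& rowPtr,
                   const std::vector<int>& cols,
                   const std::vector<double>& coefs) {
    assert(cols.size() == coefs.size());
    JadMatrix J;
    J.numRows = rowPtr.empty() ? 0 : rowPtr.size() - 1;
    J.perm.resize(J.numRows);
    std::iota(J.perm.begin(), J.perm.end(), std::size_t{0});

    auto rowLen = [&](std::size_t r) { return rowPtr[r + 1] - rowPtr[r]; };
    // Longest rows first, so each jagged diagonal covers a prefix of the rows.
    std::stable_sort(J.perm.begin(), J.perm.end(),
                     [&](std::size_t a, std::size_t b) { return rowLen(a) > rowLen(b); });

    const std::size_t maxLen = J.numRows ? rowLen(J.perm[0]) : 0;
    J.jdPtr.push_back(0);
    for (std::size_t d = 0; d < maxLen; ++d) {
        for (std::size_t k = 0; k < J.numRows && rowLen(J.perm[k]) > d; ++k) {
            const std::size_t src = rowPtr[J.perm[k]] + d;
            J.cols.push_back(cols[src]);
            J.coefs.push_back(coefs[src]);
        }
        J.jdPtr.push_back(J.coefs.size());
    }
    return J;
}

// y = A * x using the jagged-diagonal layout.
std::vector<double> jadMatVec(const JadMatrix& J, const std::vector<double>& x) {
    std::vector<double> yPerm(J.numRows, 0.0);  // result in sorted row order
    const std::size_t numDiags = J.jdPtr.empty() ? 0 : J.jdPtr.size() - 1;
    for (std::size_t d = 0; d < numDiags; ++d) {
        const std::size_t start = J.jdPtr[d];
        const std::size_t len = J.jdPtr[d + 1] - start;
        // Unit-stride sweep over coefs, cols, and yPerm; only x is gathered.
        for (std::size_t k = 0; k < len; ++k) {
            yPerm[k] += J.coefs[start + k] * x[static_cast<std::size_t>(J.cols[start + k])];
        }
    }
    // Scatter back to the original row order.
    std::vector<double> y(J.numRows, 0.0);
    for (std::size_t k = 0; k < J.numRows; ++k) {
        y[J.perm[k]] = yPerm[k];
    }
    return y;
}
```

   The access to x is still indirect, so this layout changes the loop structure rather than the locality of x itself.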

   Mike


   On 6/10/09 9:01 AM, "Alan Williams" <william at sandia.gov> wrote:

   James,
   For matrix-vector product y = A*x, the core of the sparse matvec is a statement like this:
     y[i] += Acoefs[j]*x[Acols[j]]

   So depending on how the column-indices of A are ordered, lots of cache misses in the x vector could occur.
   A matrix reordering may help in some cases.
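
   As a sketch of where that statement sits, here is a minimal compressed-row (CSR) matvec in plain C++.  The CsrMatrix struct and csrMatVec function are hypothetical names for illustration, not Epetra types or the epetra_dcrsmv kernel.  The coefs and cols arrays are read with unit stride, while the reads of x jump wherever cols[j] points, which is where the cache misses come from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container for this sketch.
struct CsrMatrix {
    std::size_t numRows = 0;
    std::vector<std::size_t> rowPtr;  // row i owns entries rowPtr[i] .. rowPtr[i+1]-1
    std::vector<int> cols;            // column index of each stored entry
    std::vector<double> coefs;        // value of each stored entry
};

// y = A * x for a CSR matrix.
std::vector<double> csrMatVec(const CsrMatrix& A, const std::vector<double>& x) {
    assert(A.rowPtr.size() == A.numRows + 1);
    assert(A.cols.size() == A.coefs.size());
    std::vector<double> y(A.numRows, 0.0);
    for (std::size_t i = 0; i < A.numRows; ++i) {
        double sum = 0.0;
        for (std::size_t j = A.rowPtr[i]; j < A.rowPtr[i + 1]; ++j) {
            // coefs[j] and cols[j] stream sequentially; x is gathered through
            // cols[j], so widely scattered column indices touch many cache lines.
            sum += A.coefs[j] * x[static_cast<std::size_t>(A.cols[j])];
        }
        y[i] = sum;
    }
    return y;
}
```

   A reordering that clusters the column indices within each row near one another keeps the gathered entries of x on fewer cache lines.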

   Alan


   > -----Original Message-----
   > From: trilinos-users-bounces at software.sandia.gov
   > [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf
   > Of James C. Sutherland
   > Sent: Tuesday, June 09, 2009 5:23 PM
   > To: trilinos-users at software.sandia.gov
   > Subject: [Trilinos-Users] Cache performance of Trilinos MatVec
   >
   > Does anyone know if there has been a study of, or effort to
   > optimize,
   > cache performance of MatVec operations in Trilinos?
   >
   > Specifically, I am finding that epetra_dcrsmv (sparse matvec) has
   > extremely bad cache performance (lots of cache misses) on an Intel
   > chipset that I have.  This seems to be problematic for a range of
   > matrix sizes.  I have very sparse matrices (3-10 nonzero entries per
   > row), and these can range in size from O(10^2) to O(10^5) rows.
   >
   > Any thoughts?
   >
   > James
   >
   > _______________________________________________
   > Trilinos-Users mailing list
   > Trilinos-Users at software.sandia.gov
   > http://software.sandia.gov/mailman/listinfo/trilinos-users
   >
  <Test_A.mm>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: invoke-configure
Type: application/octet-stream
Size: 452 bytes
Desc: invoke-configure
Url : https://software.sandia.gov/pipermail/trilinos-users/attachments/20090806/33611b20/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sparse_matvec_test.tgz
Type: application/octet-stream
Size: 7627 bytes
Desc: sparse_matvec_test.tgz
Url : https://software.sandia.gov/pipermail/trilinos-users/attachments/20090806/33611b20/attachment-0003.obj 

