[Trilinos-Users] Tpetra efficiency

Holger Brandsmeier holger.brandsmeier at sam.math.ethz.ch
Thu Jul 21 06:26:13 MDT 2011


Einar,

VbrMatrix is also a sparse matrix class, just a different type of sparse matrix.

Can't you use Teuchos::SerialDenseMatrix for your needs, particularly
since you mentioned using a single thread? This is what
Teuchos::SerialDenseMatrix was developed for. For the Scalar
types double and complex<double> it uses LAPACK and BLAS routines,
which are very efficient for dense matrices.
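
To make the dense vs. sparse storage difference concrete, here is a minimal
sketch in plain C++. It does not use Trilinos at all: SimpleDense, SimpleCrs,
dense_apply, crs_apply, to_crs, dense_bytes and crs_bytes are hypothetical
names made up for this illustration, and the code is not how
Teuchos::SerialDenseMatrix or Tpetra::CrsMatrix are implemented. It only shows
that a compressed-row format keeps a column index next to every stored value
and reads x through that index, while a dense format needs neither. Assuming
8-byte doubles and 4-byte int column indices, a fully populated 5000x5000
matrix has 25,000,000 entries, so the dense format holds about 200 MB of
values while the compressed-row format holds those 200 MB plus about 100 MB of
indices. If the product is limited by memory traffic (an assumption, not
something measured here), that alone points towards a slower sparse product
on a matrix with no zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major storage: entry (i, j) is values[i * n_cols + j].
struct SimpleDense {
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<double> values;
};

// Compressed row storage: row i owns the entries
// row_ptr[i] .. row_ptr[i + 1] - 1 of col_idx and values.
struct SimpleCrs {
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<std::size_t> row_ptr;
    std::vector<int> col_idx;
    std::vector<double> values;
};

// y = A * x for the dense format: contiguous reads of A and x.
std::vector<double> dense_apply(const SimpleDense& A, const std::vector<double>& x) {
    assert(x.size() == A.n_cols);
    std::vector<double> y(A.n_rows, 0.0);
    for (std::size_t i = 0; i < A.n_rows; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < A.n_cols; ++j) {
            sum += A.values[i * A.n_cols + j] * x[j];
        }
        y[i] = sum;
    }
    return y;
}

// y = A * x for the compressed-row format: every multiply also loads a
// column index and reads x indirectly through it.
std::vector<double> crs_apply(const SimpleCrs& A, const std::vector<double>& x) {
    assert(x.size() == A.n_cols);
    std::vector<double> y(A.n_rows, 0.0);
    for (std::size_t i = 0; i < A.n_rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            sum += A.values[k] * x[static_cast<std::size_t>(A.col_idx[k])];
        }
        y[i] = sum;
    }
    return y;
}

// Convert dense to compressed-row storage, keeping only nonzero entries.
SimpleCrs to_crs(const SimpleDense& A) {
    SimpleCrs S{A.n_rows, A.n_cols, {0}, {}, {}};
    for (std::size_t i = 0; i < A.n_rows; ++i) {
        for (std::size_t j = 0; j < A.n_cols; ++j) {
            const double v = A.values[i * A.n_cols + j];
            if (v != 0.0) {
                S.col_idx.push_back(static_cast<int>(j));
                S.values.push_back(v);
            }
        }
        S.row_ptr.push_back(S.values.size());
    }
    return S;
}

// Bytes held in the matrix arrays of each format.
std::size_t dense_bytes(const SimpleDense& A) {
    return A.values.size() * sizeof(double);
}

std::size_t crs_bytes(const SimpleCrs& A) {
    return A.values.size() * sizeof(double)
         + A.col_idx.size() * sizeof(int)
         + A.row_ptr.size() * sizeof(std::size_t);
}
```

Calling to_crs on a matrix without zeros gives a SimpleCrs with one column
index per entry, so crs_bytes is always larger than dense_bytes in that case,
while dense_apply and crs_apply return the same vector. The index arrays only
pay off when most entries are zero and can be skipped.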

Unfortunately, I do not know of any dense _parallel_ matrix class in
Trilinos. I don't even know of many dense parallel matrix
implementations outside of Trilinos, and in particular none that are
templated as in Tpetra / Teuchos. Maybe someone else knows more about this.

-Holger

On Thu, Jul 21, 2011 at 13:47, Einar Otnes <eotnes at gmail.com> wrote:
> Holger,
> Thank you for the prompt response. I didn't think about the dense vs sparse
> matrix considerations. Thank you for pointing that out. Do you know whether
> there is a dense matrix class in Tpetra?
> I was thinking of trying out VbrMatrix using a single block, but I haven't
> been able to make that work yet. Is that a possible path?
>
> Thanks again,
> Einar
>
> On Thu, Jul 21, 2011 at 12:34 PM, Holger Brandsmeier
> <holger.brandsmeier at sam.math.ethz.ch> wrote:
>>
>> Dear Einar,
>>
>> Tpetra::CrsMatrix is a sparse matrix class; Teuchos::SerialDenseMatrix
>> is a dense matrix class, as the name implies. I assume that the matrix
>> you actually test this with is also dense, in which case the difference
>> you observe is certainly to be expected.
>>
>> Note that sparse matrices are matrices where many entries are zero.
>> When there are many zero entries, Tpetra::CrsMatrix is fast. But when
>> all entries are nonzero, Tpetra::CrsMatrix is slower, as it has not
>> been designed for that case. A factor of 2 is quite low, considering
>> that you are using Tpetra::CrsMatrix for something it has not been
>> designed for.
>>
>> -Holger
>>
>> On Thu, Jul 21, 2011 at 13:08, Einar Otnes <eotnes at gmail.com> wrote:
>> > Dear experts,
>> >
>> > I have been testing the performance of a simple matrix multiplication
>> > using 2 different Matrix/Vector classes on a single node and using a
>> > single thread, i.e. Teuchos::SerialDenseMatrix/Vector and
>> > Tpetra::CrsMatrix and Tpetra::Vector, and I have seen that the run time
>> > for evaluating the same product differs by a factor ~2 between the
>> > SerialDenseMatrix and CrsMatrix.
>> > Is this behaviour expected when running on a single node/single thread?
>> > Or is the way I'm using the Tpetra matrices making my code inefficient?
>> > Below follows the output from my code showing the time it takes to run
>> > 200 evaluations of Ax=b where the size of the matrix A is 5000x5000.
>> >
>> > I have attached the code I wrote to produce the results below.
>> >
>> > All the best,
>> > Einar Otnes
>> >
>> >
>> >
>> > ==========================================================================
>> >
>> > Teuchos in Trilinos 10.6.4
>> > Tpetra in Trilinos 10.6.4
>> >
>> > Evaluate Ax=b.
>> > Problem size: numRows m= 5000 numCols n= 5000
>> > Ax=b will be evaluated 200 time(s).
>> > Initialize the Matrices and Vectors with random numbers
>> > Start the Calculations!!
>> > Done Ax=b using Teuchos::SerialDenseMatrix
>> > Done Ax=b using Tpetra::CrsMatrix
>> > Done Ax=b using Thyra/Tpetra adapter
>> >
>> > Calculate norm(b) for each of the three methods applied.
>> > bNorm= 1671.55
>> > dNorm= 1671.55
>> > b2Norm= 1671.55
>> >
>> > Tpetra Vector
>> >  Tpetra::Vector<double, int, int, Kokkos::TPINode>{length=5000}
>> >  node    0: local length=5000
>> >
>> > Thyra wrapped Tpetra Vector
>> >  Thyra::TpetraVector<double, int, int, Kokkos::TPINode>{spmdSpace=Thyra::TpetraVectorSpace<double, int, int, Kokkos::TPINode>{globalDim=5000,localSubDim=5000,localOffset=0,comm=Teuchos::SerialComm<long int>}}
>> >
>> > ================================================================================
>> >
>> >                               TimeMonitor Results
>> >
>> > Timer Name                      Local time (num calls)
>> >
>> > --------------------------------------------------------------------------------
>> > SerialDenseMatrix apply Time    3.69 (200)
>> > CrsMatrix apply Time            7.53 (200)
>> > Thyra CrsMatrix apply Time      7.43 (200)
>> >
>> > ================================================================================
>> >
>>
>>
>>
>> --
>> Holger Brandsmeier, SAM, ETH Zürich
>> http://www.sam.math.ethz.ch/people/bholger
>
>



-- 
Holger Brandsmeier, SAM, ETH Zürich
http://www.sam.math.ethz.ch/people/bholger



