[Trilinos-Users] EpetraExt mult_A_B
Williams, Alan B
william at sandia.gov
Thu Jul 27 17:53:43 MDT 2006
Thank you for the optimization. It provides a substantial
performance gain in three of the kernels (mult_A_B, mult_Atrans_B and
mult_Atrans_Btrans). I will incorporate the change into the Trilinos
code base, and it will be included in the upcoming 7.0 release.
The remaining kernel, mult_A_Btrans, can't be modified in the same way as
the others, but I found a couple of other optimizations for that routine
that also improved its performance. It is still the slowest one, so more
work is needed there.
As I mentioned, our plan is to replace at least some of these kernels
with a much faster outer-product formulation. I just haven't gotten to
> -----Original Message-----
> From: trilinos-users-bounces at software.sandia.gov
> [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf
> Of Burkhard Doliwa
> Sent: Thursday, July 27, 2006 7:44 AM
> To: trilinos-users at software.sandia.gov
> Subject: [Trilinos-Users] EpetraExt mult_A_B
> Dear EpetraExt authors,
> looking at the EpetraExt matrix-matrix multiplication routine
> (mult_A_B), one sees that its run time scales as n*n with the
> size n of square n x n matrices (assuming a constant number of
> nonzeros per row). With a VERY small modification (see appendix),
> one can reach a scaling of ~n.
> Perhaps similar modifications can be made in the
> transpose-mult routines.
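
The appendix with the actual modification is not part of this archive, so
the following is only a minimal sketch of how a sparse product can reach
~n scaling when the number of nonzeros per row is constant. It is a generic
row-by-row (Gustavson-style) product on a hand-rolled CSR structure, not the
patch discussed above and not the EpetraExt implementation. The names Csr,
sparse_multiply, tridiagonal and entry are hypothetical and exist only for
this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSR (compressed sparse row) matrix. Hypothetical stand-in for
// illustration only; this is NOT Epetra_CrsMatrix.
struct Csr {
    std::size_t rows = 0, cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // column index of each stored entry
    std::vector<double> val;           // value of each stored entry
};

// Row-by-row product C = A * B. For each row i of A, only the rows of B
// selected by the nonzeros of A(i,:) are visited, so the work per row is
// bounded by (nnz per row of A) * (nnz per row of B), independent of n.
Csr sparse_multiply(const Csr& A, const Csr& B) {
    assert(A.cols == B.rows);
    Csr C;
    C.rows = A.rows;
    C.cols = B.cols;
    C.row_ptr.assign(1, 0);

    // Dense work arrays are allocated once (O(n) total), not once per row.
    std::vector<double> acc(B.cols, 0.0);
    std::vector<char> seen(B.cols, 0);
    std::vector<std::size_t> touched;

    for (std::size_t i = 0; i < A.rows; ++i) {
        touched.clear();
        for (std::size_t p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p) {
            const std::size_t k = A.col_idx[p];
            const double a = A.val[p];
            for (std::size_t q = B.row_ptr[k]; q < B.row_ptr[k + 1]; ++q) {
                const std::size_t j = B.col_idx[q];
                if (!seen[j]) { seen[j] = 1; touched.push_back(j); }
                acc[j] += a * B.val[q];
            }
        }
        // Emit the row and reset only the touched slots, which keeps the
        // per-row cost independent of n. Columns are not sorted here.
        for (std::size_t j : touched) {
            C.col_idx.push_back(j);
            C.val.push_back(acc[j]);
            acc[j] = 0.0;
            seen[j] = 0;
        }
        C.row_ptr.push_back(C.col_idx.size());
    }
    return C;
}

// Test helper: n x n tridiagonal matrix with stencil (-1, 2, -1).
Csr tridiagonal(std::size_t n) {
    Csr M;
    M.rows = M.cols = n;
    M.row_ptr.push_back(0);
    for (std::size_t i = 0; i < n; ++i) {
        if (i > 0) { M.col_idx.push_back(i - 1); M.val.push_back(-1.0); }
        M.col_idx.push_back(i); M.val.push_back(2.0);
        if (i + 1 < n) { M.col_idx.push_back(i + 1); M.val.push_back(-1.0); }
        M.row_ptr.push_back(M.col_idx.size());
    }
    return M;
}

// Test helper: look up M(i, j) by scanning row i; returns 0 if not stored.
double entry(const Csr& M, std::size_t i, std::size_t j) {
    for (std::size_t p = M.row_ptr[i]; p < M.row_ptr[i + 1]; ++p)
        if (M.col_idx[p] == j) return M.val[p];
    return 0.0;
}
```

With c nonzeros per row in both operands, the inner loops do at most c*c
multiply-adds per row, so the total is about n*c*c operations plus the
one-time O(n) setup of the work arrays. A product that instead searches or
clears a length-n structure for every row would pick up the extra factor of
n. As a small check, for A = tridiagonal(5) the product sparse_multiply(A, A)
has interior diagonal entries of 6, first off-diagonal entries of -4 and
second off-diagonal entries of 1.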