[Trilinos-Users] slow mv-product with FECrsMatrix

Mon Jan 24 08:28:36 MST 2011

> I just performed some simple timings for one matrix-vector product with
> an Epetra_FECrsMatrix, distributed over 48 cores of a shared-memory
> machine. After the matrix construction, keoMatrix.GlobalAssemble() is
> called to optimize the storage.

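GlobalAssemble *probably* optimizes storage, but it depends on how you call it. GlobalAssemble optionally calls FillComplete, which in turn optionally optimizes storage. Both options are on by default, so a plain GlobalAssemble() call should leave the storage optimized; you can confirm with keoMatrix.StorageOptimized().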

> RangeMap and DomainMap show that rows and columns are about
> evenly spread over the cores, and when performing the actual
> mv-product,
> 
>     M->Apply( *epetra_x, *epetra_b );
> 
> epetra_x has the DomainMap and epetra_b has the RangeMap of M.

It's often difficult to figure out how the domain-map and range-map relate to the row-map and column-map, etc.
But the most important factor in the load-balancing of the mat-vec is the distribution of the nonzeros of your matrix. If the nonzeros are evenly distributed over the procs, then the per-proc times should be roughly even. I would be interested in seeing the 'min over procs' and 'max over procs' of the number-of-nonzeros in your matrix.
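
A rough sketch of that min/max check, written as standalone C++ so it can be tried without Trilinos. The names NnzBalance and nnz_balance are made up for illustration and are not part of Epetra. In a real run, each process would take its local count from M->NumMyNonzeros() and combine the counts with M->Comm().MinAll(), MaxAll() and SumAll() instead of collecting them in a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Summary of how evenly the nonzeros are spread over the processes.
// (Hypothetical helper type, not an Epetra class.)
struct NnzBalance {
    long long min_nnz;   // fewest nonzeros stored on any process
    long long max_nnz;   // most nonzeros stored on any process
    double imbalance;    // max / average; 1.0 means perfectly even
};

// counts[p] is the number of locally stored nonzeros on process p.
// Assumption: with Epetra, each rank would obtain its own entry from
// NumMyNonzeros() and the reductions below would be done with the
// communicator's MinAll / MaxAll / SumAll rather than on a vector.
NnzBalance nnz_balance(const std::vector<long long>& counts) {
    assert(!counts.empty());
    const auto mm = std::minmax_element(counts.begin(), counts.end());
    long long total = 0;
    for (long long c : counts) total += c;
    const double avg = static_cast<double>(total) / counts.size();
    const double ratio = avg > 0.0 ? *mm.second / avg : 1.0;
    return {*mm.first, *mm.second, ratio};
}
```

An imbalance ratio well above 1.0 would mean that one process holds noticeably more nonzeros than the average, and the mat-vec would be expected to wait on that process.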

Alan