[Trilinos-Users] [EXTERNAL] Re: direct inversion of a projected matrix (Amesos and Epetra_CrsMatrix?)

Heroux, Michael A maherou at sandia.gov
Thu Mar 23 04:18:34 EDT 2017


Denis,


If your projected matrix problem will only have a few thousand equations and about 100 nonzeros per row, it is probably best to solve the problem redundantly on each MPI rank.  I doubt that a distributed-memory solver will be much faster, and it will complicate your coding quite a bit.


Also, even though you will apply the inverse multiple times, it is still best to compute a factorization once and then call the triangular solves multiple times, rather than forming an explicit inverse.
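
As a rough illustration of this factor-once, solve-many pattern on a redundant per-rank dense copy, here is a minimal sketch using Teuchos::LAPACK (the function name, matrix layout, and right-hand sides are placeholders; error checking is omitted):

#include <vector>
#include "Teuchos_LAPACK.hpp"

// Factor a dense n x n matrix once, then reuse the factors for many solves.
// Every MPI rank holds the same copy of A, so no communication is needed.
void factor_and_solve_many(int n,
                           std::vector<double>& A,                 // column-major, n*n entries
                           std::vector<std::vector<double>>& rhs)  // each right-hand side has length n
{
  Teuchos::LAPACK<int, double> lapack;
  std::vector<int> ipiv(n);
  int info = 0;

  // One LU factorization (DGETRF); A is overwritten by its L and U factors.
  lapack.GETRF(n, n, A.data(), n, ipiv.data(), &info);

  // Many triangular solves (DGETRS); each b is overwritten by the solution.
  for (auto& b : rhs)
    lapack.GETRS('N', n, 1, A.data(), n, ipiv.data(), b.data(), n, &info);
}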


It is also not clear that a sparse approach will be better than a dense one.  With 10% of the entries nonzero in the original matrix, call it A, the factors will likely be quite dense because of fill-in.


For a 1000x1000 matrix with 10% density, storage for A is 8 MB dense and 1.2 MB sparse (0.8 MB for the matrix values, 0.4 MB for 32-bit column indices).  The dense factorization is done in place, so there is no extra storage.  The sparse factorization is done in a copy, so with any reasonable amount of fill you will soon be using 50% or more of the storage of the dense approach.  The sparse solver will probably do fewer operations, but it will not perform as well per operation, assuming you have an optimized BLAS and a reasonable version of LAPACK.
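
For reference, the arithmetic behind those storage figures, assuming 8-byte values and 4-byte indices, is simply

\[
\text{dense: } 1000^2 \times 8\,\text{B} = 8\,\text{MB}, \qquad
\text{sparse: } 0.1 \times 1000^2 \times (8 + 4)\,\text{B} = 1.2\,\text{MB}.
\]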


A 1000x1000 dense matrix factorization is fewer than 1B operations (roughly (2/3)n^3, so about 0.7 billion floating-point operations for n = 1000).  A good version of DGETRF/DGETRS will perform the factorization and all of the solves in less than 1 second of wall time.


Solving the dense system redundantly on every rank takes the same wall-clock time as solving it on a single rank, since each rank does identical work.  If your matrix size is a few thousand, the dense solve may take on the order of 10 seconds, but a highly optimized LAPACK, like the MKL version for example, will reduce that time.  If you have extra cores available for threaded execution, the time will drop substantially.


Perhaps I have missed an important detail, but hopefully this helps.


Mike


________________________________
From: Denis Davydov <davydden at gmail.com>
Sent: Wednesday, March 22, 2017 5:32 AM
To: Rajamanickam, Sivasankaran
Cc: Heroux, Michael A; trilinos-users
Subject: Re: [EXTERNAL] Re: [Trilinos-Users] direct inversion of a projected matrix (Amesos and Epetra_CrsMatrix?)

Dear Mike and Siva,

Thanks for the prompt reply.

The problem sizes I have in mind range from a few hundred (a test case with full matrices, without accounting for sparsity) to
a few thousand (the planned production runs with sparse matrices, roughly up to 100 nonzero entries per row).

I understand that the performance will not be ideal, yet I assume it should be comparable to what I would get
by using ScaLAPACK directly on full matrices to do the LU factorization with MPI, right?
Most importantly, it should be better than what I currently have:
a completely sequential matrix with its serial inverse computed on each MPI rank using LAPACK.

P.S. For this application I consider direct solvers because (i) ScaLAPACK was used by other researchers in the same context,
and (ii) the matrix size is still small compared to the FE index space and I need to apply the inverse of a single matrix several times
in different parts of the algorithm.

Regards,
Denis.

On 22 Mar 2017, at 01:11, Rajamanickam, Sivasankaran <srajama at sandia.gov<mailto:srajama at sandia.gov>> wrote:

Denis,
  As Mike said, the performance may not be what you want. Depending on the problem size, the number of MPI ranks, and the number of factorizations versus the number of solves, you might be better off creating a distributed sparse matrix and still using a simple sequential sparse solver (assuming memory is not a bottleneck).  The asymptotic number of operations will come into play very quickly in this case.
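
A minimal sketch of that approach, assuming an already-filled distributed Epetra_CrsMatrix A and matching Epetra_MultiVectors x and b (the names and setup are placeholders): Amesos_Klu gathers the matrix to one rank, factors it serially, and lets you reuse the numeric factorization for repeated solves.

#include "Epetra_CrsMatrix.h"
#include "Epetra_MultiVector.h"
#include "Epetra_LinearProblem.h"
#include "Amesos.h"
#include "Amesos_BaseSolver.h"

// A: filled, distributed sparse matrix; x, b: solution and right-hand side.
void solve_with_sequential_klu(Epetra_CrsMatrix& A,
                               Epetra_MultiVector& x,
                               Epetra_MultiVector& b)
{
  Epetra_LinearProblem problem(&A, &x, &b);

  Amesos factory;
  Amesos_BaseSolver* solver = factory.Create("Amesos_Klu", problem);

  solver->SymbolicFactorization();   // analyze the sparsity pattern once
  solver->NumericFactorization();    // factor once

  // Reuse the factorization: update b and call Solve() as often as needed.
  solver->Solve();

  delete solver;
}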

-Siva

________________________________
From: Trilinos-Users <trilinos-users-bounces at trilinos.org<mailto:trilinos-users-bounces at trilinos.org>> on behalf of Heroux, Michael A <maherou at sandia.gov<mailto:maherou at sandia.gov>>
Sent: Tuesday, March 21, 2017 6:43 PM
To: Denis Davydov; trilinos-users
Subject: [EXTERNAL] Re: [Trilinos-Users] direct inversion of a projected matrix (Amesos and Epetra_CrsMatrix?)

Denis,

You can store a full matrix as an Epetra_CrsMatrix and use Amesos.  Performance will be quite poor, I think.
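
For example, here is a sketch of filling a logically full N x N matrix into an Epetra_CrsMatrix (the function name and the get_value callback are placeholders for however you compute the projected entries):

#include <numeric>
#include <vector>
#include "Epetra_Comm.h"
#include "Epetra_Map.h"
#include "Epetra_CrsMatrix.h"

// Build an Epetra_CrsMatrix whose rows are completely full (N entries per row).
Epetra_CrsMatrix build_full_crs(int N, const Epetra_Comm& comm,
                                double (*get_value)(int row, int col))
{
  Epetra_Map row_map(N, 0, comm);        // rows distributed across the ranks
  Epetra_CrsMatrix A(Copy, row_map, N);  // N nonzeros per row, i.e. a full row

  std::vector<int> cols(N);
  std::iota(cols.begin(), cols.end(), 0);  // every column index 0..N-1
  std::vector<double> vals(N);

  for (int lid = 0; lid < row_map.NumMyElements(); ++lid) {
    const int gid = row_map.GID(lid);
    for (int j = 0; j < N; ++j)
      vals[j] = get_value(gid, j);
    A.InsertGlobalValues(gid, N, vals.data(), cols.data());
  }
  A.FillComplete();
  return A;
}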

Can you give more details about the dimensions of the problems?

Also, are you using an optimized BLAS for your LAPACK computations?

Mike

________________________________
From: Trilinos-Users <trilinos-users-bounces at trilinos.org<mailto:trilinos-users-bounces at trilinos.org>> on behalf of Denis Davydov <davydden at gmail.com<mailto:davydden at gmail.com>>
Sent: Tuesday, March 21, 2017 9:38 AM
To: trilinos-users
Subject: [EXTERNAL] [Trilinos-Users] direct inversion of a projected matrix (Amesos and Epetra_CrsMatrix?)

Dear all,

I would like to have an MPI-parallel inversion of an (eventually sparse) matrix whose size N is several hundred.
I put “sparse” in brackets because for starters I need to consider it full. The matrix represents the
projection of an FE operator onto a vector subspace. I guess something similar is probably done in
Anasazi in the context of eigensolvers.

So the main question is whether I can use an Epetra_CrsMatrix but, for now, set the sparsity pattern to be full.
If that is possible, I suppose I should be able to use the Amesos direct solvers to get an LU factorization
of the matrix (say, using ScaLAPACK) and then do whatever I want with it.

If there are other classes in Trilinos to do the job better, please let me know.

P.S. Currently I do the inversion in serial, ultimately using LAPACK, but this takes too much time.

Sincerely,
Denis.


