[Trilinos-Users] Belos GMRES results changing with MPI rank count

Alicia Klinvex aklinvex at gmail.com
Wed Jun 8 11:28:23 EDT 2016


Hello Joseph,

I recommend you try calling the MatrixMarket writer to ensure that your
operators are in fact the same:
https://trilinos.org/docs/dev/packages/tpetra/doc/html/classTpetra_1_1MatrixMarket_1_1Writer.html#a6fcc0d5884f6b7ec0016dc628b170e86

I don't know if you've ever used MatrixMarket before, but it's a filetype
that you can easily read into Matlab for debugging (assuming your matrix is
small enough/computer is big enough).  I would start by calling that writer
from 1, 2, and 4 MPI processes and using Matlab to determine whether those
matrices are actually the same.  I would also use Matlab to
estimate/compute the condition number.  I assume you're aware that you can
get small differences in your mat-vec with different numbers of MPI
processes, just because the order of operations is different, and those
differences can be magnified like any other error.

Best wishes,
Alicia

On Wed, Jun 8, 2016 at 9:17 AM, Rutherford, Joseph M <jmruther at illinois.edu>
wrote:

> All,
>
>
>
> My custom operator expresses a diagonalized dense operator as a cascade of
> sparse operators.  If A,B,C are Tpetra::Operators, I is an optional
> identity Operator effecting Tpetra::Import, and x,y are MultiVectors, then
> my custom operator is executing y=(A*B+C)*I*x.   My system is
>
> 1.)    Vectors x,y are 1:1 distributed in a non-uniform map.
>
> 2.)    apply() consistently computes the same mat-vec with MPI rank
> counts 1 and 2.
>
> 3.)    Belos GMRES converges to validated answers with 1 MPI rank.
>
> 4.)    Belos GMRES converges to different answers with MPI rank counts 1,
> 2, and 4.
>
>
>
> Points 1 and 2 suggest that perhaps the operator itself is working
> correctly. I have no idea what I might be doing incorrectly with Belos.
> The only “GMRES” solver parameters I’m defining are “Convergence Tolerance”
> and “Verbosity”.  For verbosity=127 (all enums summed together), I get the
> following output:
>
>
>
> <console>
>
> Belos::StatusTestGeneralOutput: Passed
>
>   (Num calls,Mod test,State test): (153, 1, Passed)
>
>    Passed.......OR Combination ->
>
>      OK...........Number of Iterations = 152 < 1000
>
>      Converged....(2-Norm Res Vec) / (2-Norm Prec Res0)
>
>                   residual [ 0 ] = 8.47249e-07 < 1e-06
>
>
>
> Passed.......OR Combination ->
>
>   OK...........Number of Iterations = 152 < 1000
>
>   Converged....(2-Norm Res Vec) / (2-Norm Prec Res0)
>
>                residual [ 0 ] = 8.47249e-07 < 1e-06
>
>
>
>
>
>
> =========================================================================================================================
>
>
>
>                                           TimeMonitor results over 4
> processors
>
>
>
> Timer Name                                        MinOverProcs
> MeanOverProcs    MaxOverProcs    MeanOverCallCounts
>
>
> -------------------------------------------------------------------------------------------------------------------------
>
> Belos: Operation Op*x                             9.564 (154)     9.717
> (154)      9.883 (154)     0.0631 (154)
>
> Belos: Operation Prec*x                           0 (0)           0
> (0)            0 (0)           0 (0)
>
> Belos: Orthogonalization                          0.1221 (153)    0.2889
> (153)     0.4433 (153)    0.001888 (153)
>
> Belos: PseudoBlockGmresSolMgr total solve time    9.949 (1)       9.949
> (1)        9.95 (1)        9.949 (1)
>
>
> =========================================================================================================================
>
> </console>
>
>
>
> Can anyone please suggest how to better diagnose the problem?
>
>
>
> Joe
>
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at trilinos.org
> https://trilinos.org/mailman/listinfo/trilinos-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://trilinos.org/pipermail/trilinos-users/attachments/20160608/2bffea4b/attachment.html>


More information about the Trilinos-Users mailing list