[Trilinos-Users] [EXTERNAL] MTSGS Smoother Behavior

Rajamanickam, Sivasankaran srajama at sandia.gov
Thu Mar 14 16:31:14 EDT 2019


Got it! This could be due to the underlying coloring or thread-scheduling issues.
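
A minimal sketch of why thread scheduling alone can change a result: floating-point addition is not associative, so the same numbers accumulated in a different order can give a different sum. This is plain Python, not Trilinos code, and whether this is what happens inside MTSGS is an assumption that the thread does not confirm. The helper `sum_in_order` and the sample values are hypothetical.

```python
def sum_in_order(values, order):
    """Accumulate values in the given index order, as one thread schedule might."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Hypothetical data chosen so that rounding is easy to see.
values = [1e16, 1.0, -1e16, 1.0]

# Two accumulation orders standing in for two different thread schedules.
schedule_a = sum_in_order(values, [0, 1, 2, 3])  # the first 1.0 is absorbed by 1e16
schedule_b = sum_in_order(values, [0, 2, 1, 3])  # the large terms cancel first

print(schedule_a, schedule_b)
```

The two orders return different totals even though the inputs are identical, which is the kind of run-to-run difference an iterative solver can then amplify.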

One option is to run the SGS smoother to a fixed tolerance rather than for a fixed number of iterations from inside MueLu. The cost of each apply in MueLu would then vary, but the accuracy would be the same. Can some MueLu developers chime in?
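
The idea of sweeping to a tolerance instead of a fixed count can be sketched as below. This is a plain-Python illustration on a small dense matrix, not the Ifpack2 or MueLu implementation, and it is an assumption that such a stopping rule could be configured there. The function names, the test matrix, and the tolerance are all hypothetical.

```python
def sgs_sweep(A, b, x):
    """One symmetric Gauss-Seidel sweep (forward, then backward), in place."""
    n = len(b)
    for i in list(range(n)) + list(range(n - 1, -1, -1)):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]

def residual_norm(A, b, x):
    """2-norm of b - A x."""
    n = len(b)
    return sum((b[i] - sum(A[i][j] * x[j] for j in range(n))) ** 2
               for i in range(n)) ** 0.5

def sgs_to_tolerance(A, b, x, rel_tol, max_sweeps):
    """Sweep until the residual drops by rel_tol; return (sweeps, converged)."""
    r0 = residual_norm(A, b, x)
    for sweeps in range(1, max_sweeps + 1):
        sgs_sweep(A, b, x)
        if residual_norm(A, b, x) <= rel_tol * r0:
            return sweeps, True
    return max_sweeps, False

# Hypothetical test problem: 1D Laplacian (2 on the diagonal, -1 off it).
n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
x = [0.0] * n
r0 = residual_norm(A, b, x)
sweeps, converged = sgs_to_tolerance(A, b, x, 1e-8, 1000)
print(sweeps, converged)
```

With this rule the number of sweeps is an output rather than an input, which is the trade-off described above: the cost of an apply varies, but the accuracy reached is the same.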

You could also adjust the number of iterations to account for variance due to threading.

Thanks
Siva

________________________________________
From: Duncan Karnitz <dlk at thermoanalytics.com>
Sent: Thursday, March 14, 2019 1:35 PM
To: Rajamanickam, Sivasankaran
Cc: trilinos-users
Subject: Re: [Trilinos-Users] [EXTERNAL]  MTSGS Smoother Behavior

Siva,

 The case I'm testing is a usual use case for us, where we limit the linear
 solve to a specific number of iterations, expecting that it will not fully
 converge. In this particular case that limit is 18 iterations.

 The issue we see is that when using the MTSGS smoother with 4 threads the
 solution after 18 iterations is not the same between identical runs.

 When using the SGS smoother I can produce the same solution vector every
 time I run the solve with 4 threads.

-Duncan


----- Original Message -----
From: "Sivasankaran Rajamanickam" <srajama at sandia.gov>
To: "dlk" <dlk at thermoanalytics.com>
Cc: "trilinos-users" <trilinos-users at trilinos.org>
Sent: Thursday, March 14, 2019 3:30:47 PM
Subject: Re: [Trilinos-Users] [EXTERNAL]  MTSGS Smoother Behavior

Duncan
  I don't understand what you mean by "SGS resolves the issue". You asked for 18 iterations with both SGS and MTSGS, and it appears neither converged. What am I missing?

-Siva

________________________________________
From: Duncan Karnitz <dlk at thermoanalytics.com>
Sent: Thursday, March 14, 2019 12:37 PM
To: Rajamanickam, Sivasankaran
Cc: trilinos-users
Subject: Re: [Trilinos-Users] [EXTERNAL]  MTSGS Smoother Behavior

Siva,

 Looks like using SGS resolves the issue. I should also add that all of these tests
 are being run with 4 threads.

--------------------------------------------------------------------------------
---                            Multigrid Summary                             ---
--------------------------------------------------------------------------------
Number of levels    = 6
Operator complexity = 1.20
Cycle type          = V

level     rows       nnz  nnz/row  c ratio  procs
  0    2519861  15411766     6.12               1
  1     322910   2858176     8.85     7.80      1
  2      31047    236040     7.60    10.40      1
  3       3198     20039     6.27     9.71      1
  4        467      2533     5.42     6.85      1
  5         76       438     5.76     6.14      1

Smoother (level 0) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [2519861, 2519861], Global nnz: 15411766}

Smoother (level 1) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [322910, 322910], Global nnz: 2858176}

Smoother (level 2) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [31047, 31047], Global nnz: 236040}

Smoother (level 3) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [3198, 3198], Global nnz: 20039}

Smoother (level 4) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [467, 467], Global nnz: 2533}

Smoother (level 5) pre  : KLU2 solver interface
Smoother (level 5) post : no smoother

================================================================================

                      TimeMonitor results over 1 processor

Timer Name          Global time (num calls)
--------------------------------------------------------------------------------
MueLu setup time    1.997 (1)
================================================================================

Belos::StatusTestGeneralOutput: Passed
  (Num calls,Mod test,State test): (19, 1, Passed)
   Passed.......OR Combination ->
     Failed.......Number of Iterations = 18 == 18
     Unconverged..(2-Norm Imp Res Vec)
                  residual [ 0 ] = 9.33933 > 1e-08

================================================================================

----- Original Message -----
From: "Sivasankaran Rajamanickam" <srajama at sandia.gov>
To: "dlk" <dlk at thermoanalytics.com>
Cc: "trilinos-users" <trilinos-users at trilinos.org>
Sent: Thursday, March 14, 2019 1:05:40 PM
Subject: Re: [Trilinos-Users] [EXTERNAL]  MTSGS Smoother Behavior

Duncan
  Thanks for these details! Do you see this behavior only with MTSGS? Can you change the smoother to SGS and see what happens? With MueLu there are a lot of factors other than the smoother involved, so I am trying to understand the problem by eliminating other causes.

Thanks
Siva

________________________________________
From: Duncan Karnitz <dlk at thermoanalytics.com>
Sent: Thursday, March 14, 2019 10:58 AM
To: Rajamanickam, Sivasankaran
Cc: trilinos-users
Subject: Re: [Trilinos-Users] [EXTERNAL]  MTSGS Smoother Behavior

Siva,

 I've been able to demonstrate with a mini app that, with the same matrix, rhs, initial
 guess, and number of threads, no two solves generate the exact same solution. Is
 this expected behavior from the Belos/MueLu/Tpetra stack for our setup? Here is the
 info from the preconditioner creation:

--------------------------------------------------------------------------------
---                            Multigrid Summary                             ---
--------------------------------------------------------------------------------
Number of levels    = 6
Operator complexity = 1.20
Cycle type          = V

level     rows       nnz  nnz/row  c ratio  procs
  0    2519861  15411766     6.12               1
  1     322910   2858176     8.85     7.80      1
  2      31047    236040     7.60    10.40      1
  3       3198     20039     6.27     9.71      1
  4        467      2533     5.42     6.85      1
  5         76       438     5.76     6.14      1

Smoother (level 0) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: MT Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [2519861, 2519861], Global nnz: 15411766}

Smoother (level 1) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: MT Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [322910, 322910], Global nnz: 2858176}

Smoother (level 2) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: MT Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [31047, 31047], Global nnz: 236040}

Smoother (level 3) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: MT Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [3198, 3198], Global nnz: 20039}

Smoother (level 4) both : "Ifpack2::Relaxation":
  {Initialized: true, Computed: true, Type: MT Symmetric Gauss-Seidel, sweeps: 1, damping factor: 1, Global matrix dimensions: [467, 467], Global nnz: 2533}

Smoother (level 5) pre  : KLU2 solver interface
Smoother (level 5) post : no smoother

 In this case we are limiting the number of iterations since we generally only
 partially solve the linear system before using the new guesses to update our
 non-linear parameters. Below is the summary from Belos for two runs where all
 parameters are the same:

================================================================================

                      TimeMonitor results over 1 processor

Timer Name          Global time (num calls)
--------------------------------------------------------------------------------
MueLu setup time    1.693 (1)
================================================================================

Belos::StatusTestGeneralOutput: Passed
  (Num calls,Mod test,State test): (19, 1, Passed)
   Passed.......OR Combination ->
     Failed.......Number of Iterations = 18 == 18
     Unconverged..(2-Norm Imp Res Vec)
                  residual [ 0 ] = 25.3539 > 1e-08

================================================================================

================================================================================

                      TimeMonitor results over 1 processor

Timer Name          Global time (num calls)
--------------------------------------------------------------------------------
MueLu setup time    1.714 (1)
================================================================================

Belos::StatusTestGeneralOutput: Passed
  (Num calls,Mod test,State test): (19, 1, Passed)
   Passed.......OR Combination ->
     Failed.......Number of Iterations = 18 == 18
     Unconverged..(2-Norm Imp Res Vec)
                  residual [ 0 ] = 24.371 > 1e-08

================================================================================

I am happy to try and provide more info.

-Duncan

----- Original Message -----
From: "dlk" <dlk at thermoanalytics.com>
To: "Sivasankaran Rajamanickam" <srajama at sandia.gov>
Cc: "trilinos-users" <trilinos-users at trilinos.org>
Sent: Wednesday, March 13, 2019 5:16:03 PM
Subject: Re: [Trilinos-Users] [EXTERNAL]  MTSGS Smoother Behavior

Siva,

 We use Belos BiCGStab, with MueLu preconditioning using the MTSGS smoother, as an inner
 loop linear solver. A particular problem that we run with four threads shows
 significantly different behavior between repeated runs (i.e., no two solves are the
 same). There are a lot of factors in play, but we believe we have isolated the issue
 to the Belos solver or the preconditioner and smoother.

 I understand the generalized coloring approach used to make SGS thread-safe, but
 I would expect the coloring to be the same between two identical runs of a problem.
 Consequently, I would expect runs that always use the same number of threads to
 have consistent behavior.
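
A minimal sketch of the expectation above, in plain Python rather than the Ifpack2 implementation: with a fixed valid coloring, rows of the same color do not depend on each other, so the order in which they are updated within a color does not matter. If the coloring itself differed between runs, the sweep would produce a different iterate. Whether the coloring used by MTSGS is reproducible from run to run is not established in this thread. The function name, matrix, and colorings are hypothetical, and only a forward sweep is shown (the symmetric variant adds a backward pass).

```python
def colored_gs_sweep(A, b, x, colors, reverse_within_color=False):
    """One forward Gauss-Seidel sweep that visits rows color by color."""
    x = list(x)
    n = len(b)
    for c in sorted(set(colors)):
        rows = [i for i in range(n) if colors[i] == c]
        if reverse_within_color:
            rows.reverse()  # stand-in for a different thread schedule
        for i in rows:
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# Hypothetical 4x4 1D Laplacian; rows i and i+1 are coupled.
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
b = [1.0, 1.0, 1.0, 1.0]
x0 = [0.0, 0.0, 0.0, 0.0]

red_black = [0, 1, 0, 1]     # a valid 2-coloring
three_color = [0, 1, 2, 0]   # a different, also valid, coloring

same_coloring_a = colored_gs_sweep(A, b, x0, red_black)
same_coloring_b = colored_gs_sweep(A, b, x0, red_black, reverse_within_color=True)
other_coloring = colored_gs_sweep(A, b, x0, three_color)

print(same_coloring_a)
print(same_coloring_b)
print(other_coloring)
```

The first two results are identical, while the third differs, so in this toy setting run-to-run differences would have to come from a changed coloring or from order-dependent arithmetic elsewhere, not from the scheduling of rows within one color.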

-Duncan

----- Original Message -----
From: "Rajamanickam, Sivasankaran" <srajama at sandia.gov>
To: "dlk" <dlk at thermoanalytics.com>, "trilinos-users" <trilinos-users at trilinos.org>
Sent: Wednesday, March 13, 2019 5:00:05 PM
Subject: Re: [EXTERNAL] [Trilinos-Users] MTSGS Smoother Behavior

Duncan
  MT-SGS computes a coloring and uses it to multithread the sweeps. Slight variations in the number of iterations are expected when it is used stand-alone. What is the behavior you are seeing?

Thanks
Siva

________________________________________
From: Trilinos-Users <trilinos-users-bounces at trilinos.org> on behalf of Duncan Karnitz <dlk at thermoanalytics.com>
Sent: Wednesday, March 13, 2019 2:06 PM
To: trilinos-users
Subject: [EXTERNAL] [Trilinos-Users] MTSGS Smoother Behavior

I am investigating some inconsistent multi-threaded behavior when using
the MueLu MTSGS smoother. When using more than one thread, we see that
no two runs of our solve code produce the same convergence behavior.

Is there anything in the MTSGS implementation that might not reproduce
the same computations each time when using more than one thread?

Thank you,
Duncan Karnitz
_______________________________________________
Trilinos-Users mailing list
Trilinos-Users at trilinos.org
https://trilinos.org/mailman/listinfo/trilinos-users