[Trilinos-Users] AztecOO Convergence

Sun Dec 18 22:37:15 MST 2005

Mike,

Thank you again for your prompt follow-up. You are right, the
potential cause of this problem has been discussed before, but
I haven't been able to come up with a solution. So I thought
that using a numerical example would help pinpoint the
problem and identify an appropriate solution, which you did
in your reply.

I did notice the behavior you describe in how the GMRES solution
progresses: many iterations oscillate about approximately
the same residual, then suddenly the residual falls orders
of magnitude and the solver converges to a solution. Is there
a way to update the preconditioner from previous iterations in
order to "incite" the solver to go towards the solution sooner
instead of oscillating that much?

I also noticed that solution time increases with the number
of nodes. I couldn't tell, however, if this is due to increasing
communication cost or an increasing number of iterations. In either
case, it might be better to limit the number of solving processors
for AztecOO according to problem size. Is that possible? I mean,
if most parts of a parallel program run faster with 100
nodes except the AztecOO solution part, is there a way to tell
AztecOO to utilize only 10 of the 100 processors for the solution
algorithm? (The data would still be distributed over the 100
processors, so it may need to be gathered onto the 10 solving
nodes.)

41 seems to be the maximum number of nodes that can solve this
particular problem. Is there a rule-of-thumb to calculate the
maximum number of nodes that can be used to solve a linear
problem of a given size?

I realize that most of these issues are due to the approximate
nature of iterative solutions. It would have been easier (and
perhaps more efficient) for my case if AztecOO included a
parallel direct solver. Do you have any plans to add one or
expand the current serial one to work in parallel in the near
future?

Thanks.

-ammar

----- Original Message ----- 
From: "Mike Heroux" <maherou at sandia.gov>
To: "'Ammar T. Al-Sayegh'" <alsayegh at purdue.edu>; <trilinos-users at software.sandia.gov>
Sent: Friday, December 16, 2005 7:28 PM
Subject: RE: [Trilinos-Users] AztecOO Convergence

> Ammar,
> 
> What you are seeing is the interplay between the preconditioner and
> restarted GMRES.  As has been discussed on this list before, overlapping
> Schwarz preconditioners become more "Jacobi-like" as the number of domains
> increases for a fixed-size problem.  In your case, as the number of
> processors increases, and therefore the preconditioner gets weaker, GMRES must
> work harder.  You are seeing failure at 10 processors because you
> finally reach an iteration count where GMRES by default does a "restart".
> Restarting is an efficiency feature of GMRES that attempts to keep the cost
> of iterations in check by occasionally computing the current update to the
> solution, adding it to the initial guess and then starting over.
> Unfortunately for some difficult problems, restarting can actually prohibit
> convergence.  This is your case.
> 
> If you add a statement:
> 
> // At 10 processors, you could set this to 38
> solver.SetAztecOption(AZ_kspace, 100);
> 
> prior to calling solver.Iterate()
> 
> you will see convergence for 10 processors (I got 38 iterations).
> 
> As a general rule, I like to avoid restarting GMRES if at all possible,
> since this gives the best convergence.  As you have seen, for difficult
> problems, the residual will be at 10e-1 for many iterations and then fall to
> near zero in a few iterations.  If you restart, you will miss this
> opportunity.
> 
> Mike
> 
>> -----Original Message-----
>> From: Ammar T. Al-Sayegh [mailto:alsayegh at purdue.edu] 
>> Sent: Friday, December 16, 2005 3:08 PM
>> To: trilinos-users at software.sandia.gov
>> Subject: Re: [Trilinos-Users] AztecOO Convergence
>> 
>> Hello All,
>> 
>> I'm still trying to figure out how to get more consistent
>> AztecOO results with different numbers of processors. To
>> simplify the problem, I wrote this short code.
>> 
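>> #include <iostream>
>> #include <mpi.h>
>> #include "Epetra_MpiComm.h"
>> #include "Epetra_Map.h"
>> #include "Epetra_Vector.h"
>> #include "Epetra_CrsMatrix.h"
>> #include "Epetra_LinearProblem.h"
>> #include "AztecOO.h"
>> #include "EpetraExt_VectorIn.h"     // MatrixMarketFileToVector
>> #include "EpetraExt_CrsMatrixIn.h"  // MatrixMarketFileToCrsMatrix
>> 
>> using namespace std;
>> using namespace EpetraExt;
>> 
>> int main(int argc, char *argv[])
>> {
>>     // init mpi and vars
>>     double norm2;
>>     MPI_Init(&argc,&argv);
>>     Epetra_MpiComm Comm(MPI_COMM_WORLD);
>>     Epetra_Map Map(300, 3, Comm);
>>     Epetra_Vector *b = 0;      // allocated by MatrixMarketFileToVector
>>     Epetra_Vector *x = new Epetra_Vector(Map);
>>     Epetra_CrsMatrix *A = 0;   // allocated by MatrixMarketFileToCrsMatrix
>> 
>>     // read A and b, then solve the linear problem
>>     MatrixMarketFileToVector("b", Map, b);
>>     MatrixMarketFileToCrsMatrix("A", Map, A);
>>     Epetra_LinearProblem problem(A, x, b);
>>     AztecOO solver(problem);
>>     solver.Iterate(1000, 10e-6);   // max iterations, tolerance (10e-6 == 1.0e-5)
>>     x->Norm2(&norm2);
>>     cout << norm2 << endl;
>> 
>>     // release objects and finalize mpi
>>     delete A;
>>     delete x;
>>     delete b;
>>     MPI_Finalize();
>>     return 0;
>> }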
>> 
>> The code reads A and b and solves for x, then displays the 
>> norm2 of x. What I need to achieve is to find the solver that 
>> will give me identical number of iterations and norm2 
>> regardless of the number of processors I'm using. So far, I 
>> haven't been able to get that with all the tests I did with 
>> AztecOO. Following are sample results for this code with the 
>> attached A and b files:
>> 
>> 1P)  947.26756007660583 [1 iter; 0.003615 sec]
>> 2P)  947.26754915839501 [6 iter; 0.008490 sec]
>> 4P)  947.26757223410937 [14 iter; 0.115180 sec]
>> 6P)  947.26756549199729 [22 iter; 0.009980 sec]
>> 8P)  947.26755742820296 [30 iter; 0.117318 sec]
>> 10P) Nonconvergent!
>> 
>> As you can see, not only does the solution vary with the number of
>> processors, but it also becomes nonconvergent when we reach 10 processors.
>> 
>> Any suggestion on how I can modify this code, either by using 
>> different AztecOO options or by using a different solver, so 
>> that I can get solution path and results insensitive to the 
>> number of processors?
>> 
>> Thanks.
>> 
>> 
>> -ammar 
>> 
> 
> 
>