[Trilinos-Users] Printing Objects Causes Hangs

Bartlett, Roscoe A rabartl at sandia.gov
Tue Sep 6 11:24:17 MDT 2005


Ammar,

The problem of selective output not working well in SPMD programs,
where collective operations must take place, is very common.
The solution I came up with is to always have my code print to a generic
std::ostream object such as:

-------------------------------

void myCode( std::ostream &out )
{
  ...
  Epetra_Vector x(...);
  ...
  out << "Here is the value of x:\n" << x;
}

-------------------------------

Then, I control which processor prints by creating different streams as
follows:

-------------------------------

#include <iostream>
#include <mpi.h>
#include "Teuchos_oblackholestream.hpp"

int main(...)
{
  // MPI stuff
  ...
  int pid;
  MPI_Comm_rank(comm,&pid); // Rank of this process in comm
  // Set the output stream
  Teuchos::oblackholestream blackhole; // Discards all output!
  std::ostream
    &out = ( pid==0 ? std::cout : blackhole );
  // Call my code
  myCode(out); // Now only the root process will generate output.
  ...
}

-------------------------------

You will find the class Teuchos::oblackholestream in the teuchos
package.  When you write your code to always print to a std::ostream and
never directly to std::cerr or std::cout, you have full control over
what output gets generated and where it gets generated, and you avoid
these types of SPMD problems.
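
For readers who want to try the idea without Teuchos or MPI, here is a
minimal sketch in standard C++.  The names NullBuffer, NullStream,
selectStream, report, and simulateRanks are illustrative only and are not
part of Trilinos; the loop merely simulates ranks in one process, and a
std::ostringstream stands in for std::cout.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Illustrative stand-in for a black-hole stream (not the Teuchos class):
// a stream buffer that accepts every character and stores nothing.
class NullBuffer : public std::streambuf {
protected:
  int_type overflow(int_type c) override {
    return traits_type::not_eof(c);  // report success, keep nothing
  }
};

// A std::ostream that owns a NullBuffer, so all output is discarded.
class NullStream : public std::ostream {
public:
  NullStream() : std::ostream(nullptr) {
    rdbuf(&buf_);  // attach the buffer once it is fully constructed
  }
private:
  NullBuffer buf_;
};

// Returns rootOut on rank 0 and the discarding stream on every other rank.
std::ostream& selectStream(int pid, std::ostream& rootOut, std::ostream& sink) {
  assert(pid >= 0);  // ranks are non-negative
  return pid == 0 ? rootOut : sink;
}

// Hypothetical user code: it prints only to the stream it is given.
void report(std::ostream& out, int x) {
  out << "Here is the value of x:\n" << x << "\n";
}

// Simulates nranks processes that all call report() and returns the text
// that reached the shared "console".  No MPI is involved in this sketch.
std::string simulateRanks(int nranks) {
  std::ostringstream console;  // stands in for std::cout
  NullStream blackhole;
  for (int pid = 0; pid < nranks; ++pid) {
    report(selectStream(pid, console, blackhole), pid);
  }
  return console.str();
}
```

Every simulated rank makes the same call to report(), which is what a
collective print needs, yet only the text written by rank 0 reaches the
console stream.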

I hope this helps,

Roscoe Bartlett

-----Original Message-----
From: trilinos-users-bounces at software.sandia.gov
[mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of Michael
A Heroux
Sent: Monday, September 05, 2005 10:31 AM
To: 'Ammar T. Al-Sayegh'
Cc: trilinos-users at software.sandia.gov
Subject: RE: [Trilinos-Users] Printing Objects Causes Hangs

Ammar,

Epetra assumes a so-called Single-Program-Multiple-Data (SPMD) parallel
programming model, although it can (carefully) be used in a more general
Multiple-Instruction-Multiple-Data (MIMD) model.  In an SPMD model, all
processors should participate in each operation that involves one or
more distributed objects, e.g., an Epetra_Vector.

If you want processors doing very different things in parallel, such as
subsets of processors working in tandem, then you might consider
creating MPI communicators that use subsets of MPI processes, then
create Epetra_MpiComm objects with the new MPI communicators, and then
use the Epetra_MpiComm objects to create other Epetra objects on the
subsets of processors.

If you want each processor to behave like a serial processor, but then
occasionally have them work in parallel, you can use a combination of
Epetra distributed objects, e.g., Epetra_Vector, for parallel
operations, and Epetra serial objects, e.g., Epetra_SerialDenseVector,
for serial operations.  You can share data between the distributed and
serial objects using views, but should start out by copying values back
and forth between the serial and distributed objects.

If you can give a more detailed picture of what you are trying to do, I
can try to give more direction.

Mike

> -----Original Message-----
> From: trilinos-users-bounces at software.sandia.gov
> [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of Ammar
> T. Al-Sayegh
> Sent: Monday, September 05, 2005 12:46 AM
> To: trilinos-users at software.sandia.gov
> Subject: Re: [Trilinos-Users] Printing Objects Causes Hangs
> 
> Mike,
> 
> Thank you for following up on my question. The "cout << x"
> statement is intended to be within the block, because this example is 
> a simple illustration of what I am trying to do in my program. Since each
> process will be doing a different task, but will share some vectors 
> and other variables, I need to check the values of the vectors on 
> individual processes during execution for debugging purposes without 
> having to synchronize the code so that all the processes will execute 
> the cout line at the same time.
> 
> I'm not sure if I'm heading toward the right approach for parallelizing my 
> code if my debugging scheme will not work properly. The serial coding 
> went through seamlessly with all the Epetra serial objects. But now 
> that I'm trying to parallelize my program, my MPI-background logic 
> doesn't seem to help much in here. For example, I now know how to 
> distribute objects, but still can't distribute the computation 
> workload properly as any attempt to use "If(Comm.MyPID()){}" to 
> control what gets executed in each processor will cause my program to 
> hang at some Epetra functions.
> 
> 
> -ammar
> 
> 
> ----- Original Message -----
> From: "Michael A Heroux" <maherou at sandia.gov>
> To: "'Ammar T. Al-Sayegh'" <alsayegh at purdue.edu>
> Cc: <trilinos-users at software.sandia.gov>
> Sent: Sunday, September 04, 2005 11:51 PM
> Subject: RE: [Trilinos-Users] Printing Objects Causes Hangs
> 
> 
> > Ammar,
> > 
> > The use of "cout <<" with Epetra distributed objects requires all processors
> > to participate in the operation, even if some have no elements of the
> > object.  If you move the "cout << x;" statement outside the if block, then
> > this code should work.  In fact, you can remove the if statement altogether
> > since the Random() method will work just fine without it.
> > 
> > Mike
> > 
> >> -----Original Message-----
> >> From: trilinos-users-bounces at software.sandia.gov 
> >> [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf 
> >> Of Ammar T. Al-Sayegh
> >> Sent: Saturday, September 03, 2005 9:35 PM
> >> To: trilinos-users at software.sandia.gov
> >> Subject: [Trilinos-Users] Printing Objects Causes Hangs
> >> 
> >> Hi All,
> >> 
> >> When I try to print objects to standard output from a single 
> >> processor, the program hangs. Consider the following code 
> >> with two processors, for example:
> >> 
> >>     // create the map. all global elements are in P0
> >>     int nge = 0;
> >>     if(Comm.MyPID() == 0)
> >>         nge = 4;
> >>     Epetra_Map Map(-1, nge, 0, Comm);
> >> 
> >>     // create the vector, populate it, and print it from P0
> >>     Epetra_Vector x(Map);
> >>     if(Comm.MyPID() == 0)
> >>     {
> >>         x.Random();
> >>         cout << x;
> >>     }
> >> 
> >> The program hangs after printing x to standard output as P1 
> >> terminates successfully while P0 doesn't terminate. Anyone 
> >> experienced the same problem? How can it be resolved?
> >> 
> >> Thanks.
> >> 
> >> 
> >> -ammar
> >> 
> >> _______________________________________________
> >> Trilinos-Users mailing list
> >> Trilinos-Users at software.sandia.gov
> >> http://software.sandia.gov/mailman/listinfo/trilinos-users

