[Trilinos-Users] SerialComm bug?

Bartlett, Roscoe A rabartl at sandia.gov
Fri Oct 15 14:24:00 MDT 2010


Nico,

Calling exit() or abort() (or any function that calls them, such as MPI_Abort(...)) short-circuits the destructors that would otherwise run when main() returns, which results in the RCPNode errors you are seeing.

For example, if you have:

  int main()
  {
     RCP<A> a = ...;
     RCP<B> b = ...;
     ...
     exit(0);
     return 0;
  }

then the 'A' and 'B' objects will never get deleted, because exit() ends the program without destroying the local RCP objects 'a' and 'b'. That results in RCPNode errors like the ones you are seeing.
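One way to avoid the problem is to keep all reference-counted objects inside a function and report failure through a return value instead of calling exit(). The following is only a minimal sketch: std::shared_ptr stands in for Teuchos::RCP so that it compiles without Trilinos, and the names Tracked and run are hypothetical, not part of any Trilinos API.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a class managed through a reference-counted
// pointer; it counts how many instances are currently alive.
struct Tracked {
  static int live;
  Tracked() { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// All smart pointers live in this scope, so they are released on every
// return path. (Assumption: std::shared_ptr is used here in place of
// Teuchos::RCP.)
int run(bool fail)
{
  std::shared_ptr<Tracked> a = std::make_shared<Tracked>();
  std::shared_ptr<Tracked> b = std::make_shared<Tracked>();
  assert(Tracked::live == 2);

  if (fail) {
    // Return an error code instead of calling std::exit(EXIT_FAILURE),
    // which would skip the destructors of 'a' and 'b'.
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}

// main() would then reduce to:  int main() { return run(false); }
```

With this layout the destructors of 'a' and 'b' run before main() returns, on both the success and the failure path, so nothing is left alive when the static objects are torn down.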

I just filed the Epic https://software.sandia.gov/bugzilla/show_bug.cgi?id=4969 so that we can track this issue and start making progress on this.

Thanks for pointing this out.

- Ross


> -----Original Message-----
> From: Nico Schlömer [mailto:nico.schloemer at ua.ac.be]
> Sent: Friday, October 15, 2010 12:18 PM
> To: Bartlett, Roscoe A
> Cc: Trilinos Users
> Subject: Re: [Trilinos-Users] SerialComm bug?
> 
>  > If you call exit() then memory will not get cleaned up correctly and
> you will get this warning.  Why are you calling exit()?
> 
> No reason.
> My main was basically a copy-and-paste from one of the example files
> within Trilinos. -- Is the use of exit() generally discouraged then? A
> quick
> 
>     find | xargs grep "exit("
> 
> in the Trilinos 10.6.0 source tree returns a total count of 2451. Ha!
> :)
> 
> --Nico
> 
> 
> 
> 
> 
> On 10/15/2010 07:48 PM, Bartlett, Roscoe A wrote:
> > Nico,
> >
> > If you call exit() then memory will not get cleaned up correctly and
> you will get this warning.  Why are you calling exit()?
> >
> > - Ross
> >
> >
> >> -----Original Message-----
> >> From: trilinos-users-bounces at software.sandia.gov [mailto:trilinos-
> >> users-bounces at software.sandia.gov] On Behalf Of Nico Schlömer
> >> Sent: Friday, October 15, 2010 6:27 AM
> >> To: Trilinos Users
> >> Subject: [Trilinos-Users] SerialComm bug?
> >>
> >> Hi,
> >>
> >> I just compiled my own code against a DEBUG installation of
> Trilinos,
> >> and every time Teuchos warns me about an RCP to an
> >> Epetra_SerialComm which couldn't be destroyed properly.
> >> Turns out that creating one and calling exit() already yields the same
> warning;
> >> bug?
> >>
> >> Example code is attached.
> >>
> >> Linking this against a DEBUG build gives
> >>
> >> ================== *snip* ==================
> >> ***
> >> *** Warning! The following Teuchos::RCPNode objects were created but
> >> have
> >> *** not been destroyed yet.  A memory checking tool may complain
> that
> >> these
> >> *** objects are not destroyed correctly.
> >> ***
> >> *** There can be many possible reasons that this might occur
> including:
> >> ***
> >> ***   a) The program called abort() or exit() before main() was
> >> finished.
> >> ***      All of the objects that would have been freed through
> >> destructors
> >> ***      are not freed but some compilers (e.g. GCC) will still call
> >> the
> >> ***      destructors on static objects (which is what causes this
> >> message
> >> ***      to be printed).
> >> ***
> >> ***   b) The program is using raw new/delete to manage some objects
> and
> >> ***      delete was not called correctly and the objects not deleted
> >> hold
> >> ***      other objects through reference-counted pointers.
> >> ***
> >> ***   c) This may be an indication that these objects may be
> involved
> >> in
> >> ***      a circular dependency of reference-counted managed objects.
> >> ***
> >>
> >>     0: RCPNode (map_key_void_ptr=0x18cbf40)
> >>          Information = {T=Epetra_SerialComm,
> >> ConcreteT=Epetra_SerialComm, p=0x18cbf40, has_ownership=1}
> >>          RCPNode address = 0x18cbfb0
> >>          insertionNumber = 0
> >>
> >> NOTE: To debug issues, open a debugger, and set a break point in the
> >> function where the RCPNode object is first created to determine the
> >> context where the object first gets created.  Each RCPNode object is
> >> given a unique insertionNumber to allow setting breakpoints in the
> >> code.  For example, in GDB one can perform:
> >>
> >> 1) Open the debugger (GDB) and run the program again to get updated
> >> object addresses
> >>
> >> 2) Set a breakpoint in the RCPNode insertion routine when the
> desired
> >> RCPNode is first inserted.  In GDB, to break when the RCPNode with
> >> insertionNumber==3 is added, do:
> >>
> >>     (gdb) b 'Teuchos::RCPNodeTracer::addNewRCPNode( [TAB] [ENTER]
> >>     (gdb) cond 1 insertionNumber==3 [ENTER]
> >>
> >> 3) Run the program in the debugger.  In GDB, do:
> >>
> >>     (gdb) run [ENTER]
> >>
> >> 4) Examine the call stack when the prgoram breaks in the function
> >> addNewRCPNode(...)
> >> ================== *snap* ==================
> >>
> >> on my machine.
> >>
> >> Cheers,
> >> Nico
> >
> 



