[Trilinos-Users] [EXTERNAL] test failures

Templeton, Jeremy Alan jatempl at sandia.gov
Wed Jul 9 11:12:15 MDT 2014


Hi Ross,

Thanks for the response.  Just to be clear, the failing program is an unmodified Trilinos test, specifically Tpetra_ExportToStaticGraphCrsMatrix.  It is only one example: nearly 60% of the tests fail in a similar fashion, again without any changes from the distribution.

Jeremy

On Jul 9, 2014, at 9:21 AM, Bartlett, Roscoe A. <bartlettra at ornl.gov> wrote:

Jeremy,

See section:

   “5.11.2 Detection of circular references”

in:

    http://web.ornl.gov/~8vt/TeuchosMemoryManagementSAND.pdf

My guess is that your programs are terminating incorrectly for some reason.  This could be due to a number of different issues, but it is not a good sign.  I would not trust software built on this system with these types of failures.

-Ross



From: trilinos-users-bounces at software.sandia.gov [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of Templeton, Jeremy Alan
Sent: Wednesday, July 09, 2014 12:01 PM
To: trilinos-users at software.sandia.gov
Subject: [Trilinos-Users] test failures

Hi all,

I’m trying to get Trilinos 11.8 (Tpetra, Zoltan2) running on my Mac (OS X 10.9, GCC 4.8, Open MPI), but when I build an MPI debug version, nearly 60% of the tests fail on exit.  I ran one of them through GDB and the output is below.  When I build the MPI release version with no other configuration changes, all but one test pass.  So this looks like extra error checking that is only active in the debug build.  How relevant is it?  If it’s not harmful, is there a way to turn it off?

Thanks,
Jeremy

Starting program: /Users/jatempl/Codes/Trilinos_build/packages/tpetra/test/ImportExport/Tpetra_ExportToStaticGraphCrsMatrix.exe
End Result: TEST PASSED

***
*** Warning! The following Teuchos::RCPNode objects were created but have
*** not been destroyed yet.  A memory checking tool may complain that these
*** objects are not destroyed correctly.
***
*** There can be many possible reasons that this might occur including:
***
***   a) The program called abort() or exit() before main() was finished.
***      All of the objects that would have been freed through destructors
***      are not freed but some compilers (e.g. GCC) will still call the
***      destructors on static objects (which is what causes this message
***      to be printed).
***
***   b) The program is using raw new/delete to manage some objects and
***      delete was not called correctly and the objects not deleted hold
***      other objects through reference-counted pointers.
***
***   c) This may be an indication that these objects may be involved in
***      a circular dependency of reference-counted managed objects.
***

  0: RCPNode (map_key_void_ptr=0x101353290)
       Information = {T=Teuchos::SerializationTraits<int, unsigned long>, ConcreteT=Teuchos::SerializationTraits<int, unsigned long>, p=0x101353290, has_ownership=1}
       RCPNode address = 0x1013533f0
       insertionNumber = 14
  1: RCPNode (map_key_void_ptr=0x101353440)
       Information = {T=Teuchos::SerializationTraits<int, int>, ConcreteT=Teuchos::SerializationTraits<int, int>, p=0x101353440, has_ownership=1}
       RCPNode address = 0x101354230
       insertionNumber = 15

NOTE: To debug issues, open a debugger, and set a break point in the function where
the RCPNode object is first created to determine the context where the object first
gets created.  Each RCPNode object is given a unique insertionNumber to allow setting
breakpoints in the code.  For example, in GDB one can perform:

1) Open the debugger (GDB) and run the program again to get updated object addresses

2) Set a breakpoint in the RCPNode insertion routine when the desired RCPNode is first
inserted.  In GDB, to break when the RCPNode with insertionNumber==3 is added, do:

  (gdb) b 'Teuchos::RCPNodeTracer::addNewRCPNode( [TAB] ' [ENTER]
  (gdb) cond 1 insertionNumber==3 [ENTER]

3) Run the program in the debugger.  In GDB, do:

  (gdb) run [ENTER]

4) Examine the call stack when the program breaks in the function addNewRCPNode(...)
libc++abi.dylib: terminating with uncaught exception of type std::logic_error: /Users/jatempl/Codes/trilinos-11.8.1-Source/packages/teuchos/core/src/Teuchos_RCPNode.cpp:497:

Throw number = 1

Throw test that evaluated to true: !(rcp_node_list())

Error!

Program received signal SIGABRT, Aborted.
0x00007fff8d734866 in ?? ()
(gdb) where
#0  0x00007fff8d734866 in ?? ()
#1  0x00007fff9294035c in ?? ()
#2  0x0000000000000000 in ?? ()

--------------------------------------------------------
Jeremy A. Templeton, Ph.D.
Thermal/Fluid Sciences & Engineering
jatempl at sandia.gov
http://tiny.sandia.gov/jatempl
925-294-1429

