[Trilinos-Users] [EXTERNAL] question about using ParMETIS with Trilinos

Siva Rajamanickam srajama at sandia.gov
Fri May 18 14:07:06 MDT 2012


Alicia,
  Are you deleting/releasing the solver and preconditioner objects in your
test program? If you hold onto the Ifpack_Amesos object, it will still hold
a valid RCP to the Amesos solver, I think.

I am not sure about the error "Attempting to use an MPI routine after
finalizing MPICH". Did you call MPI_Finalize before deleting the objects? It
is quite possible that MUMPS needs MPI during its final cleanup. It is also
possible that the RCPNode warning printed after it is caused by the early
finalize too.
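
To illustrate the ordering issue, here is a minimal sketch that uses neither
MPI nor Trilinos. FakeRuntime and FakeSolver are hypothetical stand-ins (for
the MPI runtime and for a solver such as Amesos_Mumps), and I am only
assuming that the real solver's destructor needs MPI. The point is just that
an object whose destructor needs the runtime has to go out of scope before
finalize is called:

```cpp
#include <memory>

// Hypothetical stand-in for the MPI runtime (MPI_Init / MPI_Finalize).
struct FakeRuntime {
  bool finalized = false;
  int late_calls = 0;  // cleanup calls that arrived after finalize()
  void finalize() { finalized = true; }
};

// Hypothetical stand-in for a solver whose destructor still needs the
// runtime for its last cleanup (assumed here, not taken from Amesos source).
class FakeSolver {
 public:
  explicit FakeSolver(FakeRuntime& rt) : rt_(rt) {}
  ~FakeSolver() {
    if (rt_.finalized) ++rt_.late_calls;  // would be an MPI error in practice
  }
 private:
  FakeRuntime& rt_;
};

// Problematic order: finalize() runs while the solver is still alive.
int late_calls_without_scope() {
  FakeRuntime rt;
  {
    std::shared_ptr<FakeSolver> solver = std::make_shared<FakeSolver>(rt);
    rt.finalize();  // like calling MPI_Finalize() too early
  }                 // solver is destroyed here, after finalize
  return rt.late_calls;
}

// Suggested order: the inner scope ends before finalize().
int late_calls_with_scope() {
  FakeRuntime rt;
  {
    std::shared_ptr<FakeSolver> solver = std::make_shared<FakeSolver>(rt);
  }                 // solver is destroyed while the runtime is still live
  rt.finalize();
  return rt.late_calls;
}
```

In a real test program the same effect should come from putting the Ifpack
and Amesos objects inside a { ... } block between MPI_Init and MPI_Finalize,
or from assigning Teuchos::null to the RCPs before calling MPI_Finalize.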

Thanks
Siva

Alicia M Klinvex wrote:
> Hello Karen,
>
> Yes, Heidi Thornquist and Siva Rajamanickam pointed out the link line problem, and I can now use MUMPS with Trilinos.  I have another question that perhaps you can help me with.
>
> I am trying to use LOBPCG to solve a generalized eigenvalue problem with Amesos_Mumps as a preconditioner.  I got Ifpack_ex_Amesos.cpp to run, but if I try to change the amesos solver type from Klu to Mumps, I get the output below.  I ran Amesos_compare_solvers.exe, and it was able to find and use Mumps, so I don't know what is causing this problem.  Do you have any idea why Ifpack will accept Klu but not Mumps?
>
> Thanks,
> Alicia
>
>
>
>                 *******************************************************
>                 ***** Problem: Epetra::CrsMatrix
>                 ***** Preconditioned GMRES solution
>                 ***** Ifpack_AdditiveSchwarz, ov = 0, local solver =
>                 ***** `Amesos_Mumps'
>                 ***** Condition number estimate = 70.6153
>                 ***** No scaling
>                 *******************************************************
>
>                 iter:    0           residual = 1.000000e+00
>                 iter:    1           residual = 1.568841e-15
>
>
>                 Solution time: 0.001528 (sec.)
>                 total iterations: 1
> Attempting to use an MPI routine after finalizing MPICH
>
> ***
> *** Warning! The following Teuchos::RCPNode objects were created but have
> *** not been destroyed yet.  A memory checking tool may complain that these
> *** objects are not destroyed correctly.
> ***
> *** There can be many possible reasons that this might occur including:
> ***
> ***   a) The program called abort() or exit() before main() was finished.
> ***      All of the objects that would have been freed through destructors
> ***      are not freed but some compilers (e.g. GCC) will still call the
> ***      destructors on static objects (which is what causes this message
> ***      to be printed).
> ***
> ***   b) The program is using raw new/delete to manage some objects and
> ***      delete was not called correctly and the objects not deleted hold
> ***      other objects through reference-counted pointers.
> ***
> ***   c) This may be an indication that these objects may be involved in
> ***      a circular dependency of reference-counted managed objects.
> ***
>
>   0: RCPNode (map_key_void_ptr=0x266ac50)
>        Information = {T=Epetra_Map, ConcreteT=Epetra_Map, p=0x266ac50, has_ownership=1}
>        RCPNode address = 0x2669e30
>        insertionNumber = 0
>   1: RCPNode (map_key_void_ptr=0x2669f30)
>        Information = {T=Epetra_CrsMatrix, ConcreteT=Epetra_CrsMatrix, p=0x2669f30, has_ownership=1}
>        RCPNode address = 0x266a110
>        insertionNumber = 1
>   2: RCPNode (map_key_void_ptr=0x266a068)
>        Information = {T=Epetra_CrsMatrix, ConcreteT=Epetra_CrsMatrix, p=0x266a068, has_ownership=0}
>        RCPNode address = 0x266a6e0
>        insertionNumber = 2
>   3: RCPNode (map_key_void_ptr=0x266a550)
>        Information = {T=Ifpack_AdditiveSchwarz<Ifpack_Amesos>, ConcreteT=Ifpack_AdditiveSchwarz<Ifpack_Amesos>, p=0x266a550, has_ownership=1}
>        RCPNode address = 0x266a730
>        insertionNumber = 3
>   4: RCPNode (map_key_void_ptr=0x26911f0)
>        Information = {T=Epetra_Time, ConcreteT=Epetra_Time, p=0x26911f0, has_ownership=1}
>        RCPNode address = 0x2674b90
>        insertionNumber = 4
>   5: RCPNode (map_key_void_ptr=0x2691190)
>        Information = {T=Epetra_MpiComm, ConcreteT=Epetra_MpiComm, p=0x2691190, has_ownership=1}
>        RCPNode address = 0x2674be0
>        insertionNumber = 5
>   6: RCPNode (map_key_void_ptr=0x26750e0)
>        Information = {T=Epetra_Map, ConcreteT=Epetra_Map, p=0x26750e0, has_ownership=1}
>        RCPNode address = 0x266a9d0
>        insertionNumber = 6
>   7: RCPNode (map_key_void_ptr=0x26765a0)
>        Information = {T=Epetra_Vector, ConcreteT=Epetra_Vector, p=0x26765a0, has_ownership=1}
>        RCPNode address = 0x26753d0
>        insertionNumber = 7
>   8: RCPNode (map_key_void_ptr=0x2674fa0)
>        Information = {T=Ifpack_LocalFilter, ConcreteT=Ifpack_LocalFilter, p=0x2674fa0, has_ownership=1}
>        RCPNode address = 0x26754e0
>        insertionNumber = 8
>   9: RCPNode (map_key_void_ptr=0x267e570)
>        Information = {T=Epetra_LinearProblem, ConcreteT=Epetra_LinearProblem, p=0x267e570, has_ownership=1}
>        RCPNode address = 0x267e5c0
>        insertionNumber = 9
>  10: RCPNode (map_key_void_ptr=0x26755f0)
>        Information = {T=Ifpack_Amesos, ConcreteT=Ifpack_Amesos, p=0x26755f0, has_ownership=1}
>        RCPNode address = 0x267e680
>        insertionNumber = 10
>  11: RCPNode (map_key_void_ptr=0x267e700)
>        Information = {T=std::vector<Teuchos::RCP<Epetra_Time>, std::allocator<Teuchos::RCP<Epetra_Time> > >, ConcreteT=std::vector<Teuchos::RCP<Epetra_Time>, std::allocator<Teuchos::RCP<Epetra_Time> > >, p=0x267e700, has_ownership=1}
>        RCPNode address = 0x267eb70
>        insertionNumber = 12
>  12: RCPNode (map_key_void_ptr=0x267eda0)
>        Information = {T=std::vector<Amesos_Time_Data, std::allocator<Amesos_Time_Data> >, ConcreteT=std::vector<Amesos_Time_Data, std::allocator<Amesos_Time_Data> >, p=0x267eda0, has_ownership=1}
>        RCPNode address = 0x267ebc0
>        insertionNumber = 13
>  13: RCPNode (map_key_void_ptr=0x267ee30)
>        Information = {T=Amesos_Mumps, ConcreteT=Amesos_Mumps, p=0x267ee30, has_ownership=1}
>        RCPNode address = 0x267ff00
>        insertionNumber = 14
>  14: RCPNode (map_key_void_ptr=0x267fea0)
>        Information = {T=Epetra_Time, ConcreteT=Epetra_Time, p=0x267fea0, has_ownership=1}
>        RCPNode address = 0x267ec10
>        insertionNumber = 15
>
> NOTE: To debug issues, open a debugger, and set a break point in the function where
> the RCPNode object is first created to determine the context where the object first
> gets created.  Each RCPNode object is given a unique insertionNumber to allow setting
> breakpoints in the code.  For example, in GDB one can perform:
>
> 1) Open the debugger (GDB) and run the program again to get updated object addresses
>
> 2) Set a breakpoint in the RCPNode insertion routine when the desired RCPNode is first
> inserted.  In GDB, to break when the RCPNode with insertionNumber==3 is added, do:
>
>   (gdb) b 'Teuchos::RCPNodeTracer::addNewRCPNode( [TAB] ' [ENTER]
>   (gdb) cond 1 insertionNumber==3 [ENTER]
>
> 3) Run the program in the debugger.  In GDB, do:
>
>   (gdb) run [ENTER]
>
> 4) Examine the call stack when the program breaks in the function addNewRCPNode(...)
>
>
>
> ----- Original Message -----
> From: "Karen D Devine" <kddevin at sandia.gov>
> To: "Alicia M Klinvex" <aklinvex at purdue.edu>, trilinos-users at software.sandia.gov
> Cc: "Michael A Heroux" <maherou at sandia.gov>, "Faisal Saied" <fsaied at purdue.edu>
> Sent: Friday, May 18, 2012 2:43:52 PM
> Subject: Re: [EXTERNAL] [Trilinos-Users] question about using ParMETIS with Trilinos
>
> Alicia:  Have you resolved this problem?  I know the link line needs to
> list libmetis.a AFTER libparmetis.a.  It is not clear from your email what
> order is used by TRIBITS.  Try running CMAKE with
> -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON
> Then check the link line for the unit test with the failing build.  If
> libmetis.a is before libparmetis.a, there is a bug in TRIBITS that needs
> to be fixed; we can report it to the appropriate person.
> Karen
>
>
>
>
> On 5/9/12 1:25 PM, "Alicia M Klinvex" <aklinvex at purdue.edu> wrote:
>
>   
>> Hello,
>>
>> I'm having a bit of trouble, and I was hoping you could help me.  I want
>> to solve a generalized eigenvalue problem using block Krylov-Schur, which
>> requires a linear solver in Amesos.  I would like to use a distributed
>> memory solver such as Amesos_Superludist or Amesos_Mumps, both of which
>> depend on ParMETIS.  I installed and tested ParMETIS, which has been
>> working fine on its own.  In my configure script, I turned on ParMETIS
>> and set the paths, and when I run the script, Trilinos seems to find
>> everything necessary:
>>
>> TRIBITS_TPL_DECLARE_LIBRARIES: ParMETIS
>> -- PARSE_REQUIRED_HEADERS='parmetis.h'
>> -- PARSE_REQUIRED_LIBS_NAMES='parmetis;metis'
>> -- TPL_ParMETIS_INCLUDE_DIRS=''
>> -- TPL_ParMETIS_LIBRARIES=''
>> -- ParMETIS_LIBRARY_DIRS='/u/slotnick_s2/aklinvex/ParMetis-3.0'
>> -- ParMETIS_LIBRARY_NAMES='libmetis.a; libparmetis.a'
>> -- PARSE_REQUIRED_LIBS_NAMES='libmetis.a; libparmetis.a'
>> -- ParMETIS_LIBRARY_DIRS='/u/slotnick_s2/aklinvex/ParMetis-3.0'
>> -- ParMETIS_INCLUDE_DIRS='/u/slotnick_s2/aklinvex/ParMetis-3.0'
>> -- PARSE_REQUIRED_HEADERS='parmetis.h'
>> -- LIBNAME_SET='libmetis.a'
>> -- _ParMETIS_libmetis.a_LIBRARY='/u/slotnick_s2/aklinvex/ParMetis-3.0/libmetis.a'
>> -- LIBNAME_SET=' libparmetis.a'
>> -- _ParMETIS_libparmetis.a_LIBRARY='/u/slotnick_s2/aklinvex/ParMetis-3.0/libparmetis.a'
>> -- INCLUDE_FILE='parmetis.h'
>> -- _ParMETIS_parmetis.h_PATH='/u/slotnick_s2/aklinvex/ParMetis-3.0'
>> -- TPL_ParMETIS_LIBRARY_DIRS=''
>>
>> However, when I type "make", I get many errors of the following form:
>>
>> [ 60%] Built target
>> ThyraEpetraExtAdapters_EpetraExtDiagScalingTransformer_UnitTests
>> /u/slotnick_s2/aklinvex/ParMetis-3.0/libparmetis.a(initpart.o): In
>> function `Moc_InitPartition_RB__':
>> initpart.c:(.text+0x840): undefined reference to
>> `METIS_mCPartGraphRecursive2'
>> initpart.c:(.text+0x12f5): undefined reference to
>> `METIS_mCPartGraphRecursive2'
>> initpart.c:(.text+0x17da): undefined reference to `METIS_WPartGraphKway2'
>> initpart.c:(.text+0x1887): undefined reference to `METIS_WPartGraphKway2'
>> /u/slotnick_s2/aklinvex/ParMetis-3.0/libparmetis.a(initbalance.o): In
>> function `Balance_Partition__':
>> initbalance.c:(.text+0x1184): undefined reference to
>> `METIS_mCPartGraphRecursive2'
>> initbalance.c:(.text+0x1381): undefined reference to
>> `METIS_mCPartGraphRecursive2'
>> initbalance.c:(.text+0x1a55): undefined reference to
>> `METIS_WPartGraphKway2'
>> initbalance.c:(.text+0x1afb): undefined reference to
>> `METIS_WPartGraphKway2'
>> /u/slotnick_s2/aklinvex/ParMetis-3.0/libparmetis.a(initmsection.o): In
>> function `InitMultisection__':
>> initmsection.c:(.text+0xbd9): undefined reference to
>> `METIS_EdgeComputeSeparator'
>> initmsection.c:(.text+0xc3c): undefined reference to
>> `METIS_NodeComputeSeparator'
>>
>> Trilinos seems to forget where METIS is.  I have tried it with ParMETIS
>> 3.0 and 4.0.2, both of which behave the same way.  Do you know what could
>> be causing this problem?
>>
>> Thank you,
>> Alicia Klinvex
>> aklinvex at purdue.edu
>>
>> _______________________________________________
>> Trilinos-Users mailing list
>> Trilinos-Users at software.sandia.gov
>> http://software.sandia.gov/mailman/listinfo/trilinos-users
>>     
>
>
>
>
>   
