[Trilinos-Users] Trilinos-Users Digest, Vol 101, Issue 9

Riccardo Rossi rrossi at cimne.upc.edu
Wed Jan 22 00:53:30 MST 2014


Just a comment regarding possible future Epetra-Tpetra interfaces:

in my opinion it would be very nice to have a performant (ideally
zero-copy) translation of Epetra_CrsMatrix to Tpetra, particularly since, as I
understand it, Tpetra lacks the FECrsMatrix capabilities, which are very
handy for FE programs.

Having said this, I should add that I have never used Tpetra, but it would
indeed be nice to have a simple way to try out possible performance
improvements without writing a lot of code...

greetings
Riccardo



On Tue, Jan 21, 2014 at 8:00 PM, <trilinos-users-request at software.sandia.gov> wrote:

> Today's Topics:
>
>    1. Re: Simple way to convert Epetra CrsMatrix to     Tpetra
>       CrsMatrix (Hoemmen, Mark)
>    2. Re: aztec00 problem (Heroux, Mike)
>    3. Re: [EXTERNAL] Re:  Trilinos Build Problem
>       (Gary.Myers.Contractor at unnpp.gov)
>    4. CMAKE issues in PyTrilinos (Bunde, Kermit A)
>    5. Re: aztec00 problem (Matthias Heil)
>    6. Re: aztec00 problem (Heroux, Mike)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 20 Jan 2014 21:03:37 +0000
> From: "Hoemmen, Mark" <mhoemme at sandia.gov>
> Subject: Re: [Trilinos-Users] Simple way to convert Epetra CrsMatrix
>         to      Tpetra CrsMatrix
> To: "<trilinos-users at software.sandia.gov>"
>         <trilinos-users at software.sandia.gov>
> Message-ID: <FFFE87AD-71A4-4BEC-88C9-E80F68155C69 at sandia.gov>
> Content-Type: text/plain; charset="Windows-1252"
>
> On Jan 20, 2014, at 12:00 PM, <trilinos-users-request at software.sandia.gov>
>  <trilinos-users-request at software.sandia.gov> wrote:
> > Date: Mon, 20 Jan 2014 14:02:08 +0000
> > From: Timo Betcke <t.betcke at ucl.ac.uk>
> > Subject: [Trilinos-Users] Simple way to convert Epetra CrsMatrix to
> >       Tpetra  CrsMatrix
> > To: trilinos-users at software.sandia.gov
> >
> > Dear Trilinos Users,
> >
> > I am starting to move some code from Epetra to Tpetra. To begin, I'd like
> > to create Tpetra Crs matrices given Epetra Crs matrices as input. Is there
> > a simpler way to do this than iterating through the rows of the
> > Epetra matrix and copying elements over to a Tpetra matrix?
>
> Hi Timo --
>
> You could use the setAllValues() and expertStaticFillComplete() methods of
> Tpetra::CrsMatrix to speed up the process.  I won't say that this is
> "simpler" but it may perform better.  Here is how it would work:
>
> 1. Create the Epetra_CrsMatrix A_e.
>
> 2. Call FillComplete on A_e.
>
> 3. Extract raw pointers to the data in A_e, and their lengths.
>
> int* ptr;
> int* ind;
> double* val;
>
> int info = A_e.ExtractCrsDataPointers (&ptr, &ind, &val);
> if (info != 0) {
>   // ? report error and exit ?
> }
> const int numRows = A_e.Graph ().NumMyRows ();
> const int nnz = A_e.Graph ().NumMyEntries ();
>
> 4. Copy data into new Teuchos::ArrayRCP arrays.  (Note that ptr2 and ptr
> store data of different types.)
>
> Teuchos::ArrayRCP<size_t> ptr2 (numRows+1);
> Teuchos::ArrayRCP<int> ind2 (nnz);
> Teuchos::ArrayRCP<double> val2 (nnz);
>
> std::copy (ptr, ptr + numRows + 1, ptr2.begin ());
> std::copy (ind, ind + nnz, ind2.begin ());
> std::copy (val, val + nnz, val2.begin ());
>
> 5. Create a Tpetra::CrsMatrix A_t with the appropriate row (and column)
> Map(s), just as you created A_e above.
>
> 6. Give A_t the new data arrays:
>
> A_t.setAllValues (ptr2, ind2, val2);
>
> 7. Call fillComplete on A_t.  You may also use expertStaticFillComplete if
> you already have Import (and Export) objects constructed.
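
Steps 3 and 4 above can be sketched without Trilinos as follows. This is only an illustration under stated assumptions: std::vector stands in for Teuchos::ArrayRCP, and the names CrsArrays and copyCrsArrays are hypothetical, not part of Epetra or Tpetra. The point it shows is that the row offsets change type (int to size_t) during the copy, while the column indices and values keep theirs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Owned copies of the three local CRS arrays. std::vector is an assumption
// made so that the sketch compiles without Trilinos; the message above uses
// Teuchos::ArrayRCP for the same purpose.
struct CrsArrays {
  std::vector<std::size_t> ptr;  // row offsets, numRows + 1 entries
  std::vector<int> ind;          // local column indices, nnz entries
  std::vector<double> val;       // matrix values, nnz entries
};

// Hypothetical helper: copy raw CRS pointers (step 3) into owned arrays
// (step 4). The row offsets are widened from int to size_t during the copy.
CrsArrays copyCrsArrays (const int* ptr, const int* ind, const double* val,
                         int numRows, int nnz)
{
  // The last row offset must equal the number of stored entries.
  assert (ptr[numRows] == nnz);

  CrsArrays out;
  out.ptr.assign (ptr, ptr + numRows + 1);
  out.ind.assign (ind, ind + nnz);
  out.val.assign (val, val + nnz);
  return out;
}
```

For a 2 x 2 matrix with rows (4, -1) and (0, 2), the inputs would be ptr = {0, 2, 3}, ind = {0, 1, 1} and val = {4, -1, 2}, with numRows = 2 and nnz = 3.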
>
> If there is sufficient interest, we may also provide a Tpetra::RowMatrix
> interface to an Epetra_CrsMatrix object.  This would let you use Ifpack2
> with Epetra objects.
>
> mfh
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 20 Jan 2014 21:18:01 +0000
> From: "Heroux, Mike" <MHeroux at csbsju.edu>
> Subject: Re: [Trilinos-Users] aztec00 problem
> To: Matthias Heil <matthias.heil at manchester.ac.uk>,
>         "trilinos-bugs at software.sandia.gov"
>         <trilinos-bugs at software.sandia.gov>,
>         "trilinos-users at software.sandia.gov"
>         <trilinos-users at software.sandia.gov>
> Cc: Andrew Hazel <ahazel at maths.manchester.ac.uk>,       David Wilke
>         <david.wilke at student.adelaide.edu.au>
> Message-ID: <CF02E861.B2724%mheroux at csbsju.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> Matthias,
>
> Do you have a sense of whether or not the data sizes you are using would
> result in array indexing that exceeds 2.1 billion?  The kinds of issues you
> are seeing would be consistent with trying to address an array using an
> integer value that is bigger than what a signed int can handle.
>
> The hex values you are printing are very large (more than 140 trillion),
> which seems to indicate an incorrect address calculation somewhere.  I
> agree that the memory manager should detect the issue, no matter what.
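
The two magnitudes mentioned above can be checked directly. A minimal sketch, assuming a platform where int is 32 bits wide (so INT_MAX is about 2.1 billion); the helper name exceedsSignedInt and the constant name kBadAddress are hypothetical, and the address is the one gdb printed in the report quoted below.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Address that gdb reported as inaccessible in the bug report.
constexpr std::uint64_t kBadAddress = 0x8001cb466110ULL;

// The address is indeed larger than 140 trillion (140 * 10^12).
static_assert (kBadAddress > 140000000000000ULL,
               "address should exceed 140 trillion");

// Hypothetical helper: true if n cannot be represented as a signed int.
constexpr bool exceedsSignedInt (std::uint64_t n)
{
  return n > static_cast<std::uint64_t> (INT_MAX);
}
```

Any byte count or index for which exceedsSignedInt returns true would be corrupted if it passed through a signed int on its way to the allocator.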
>
> Mike
>
> On 1/20/14 8:24 AM, "Matthias Heil" <matthias.heil at manchester.ac.uk>
> wrote:
>
> >Hi,
> >
> >   we've come across a possible bug in trilinos aztecoo.
> >The code seg faults when trying to execute the line
> >
> >    *dst_ptr++ = s;
> >
> >in
> >
> >trilinos-11.4.3-Source/packages/epetra/src/Epetra_CrsMatrix.cpp:3327
> >
> >An attempt to de-reference that pointer (in ddd) shows:
> >
> >(gdb) print *dst_ptr
> >Cannot access memory at address 0x8001cb466110
> >
> >Moving back through the call stack shows that the
> >memory is initially allocated in AZ_manage_memory(...)
> >which is called from just under
> >
> >    trilinos-11.4.3-Source/packages/aztecoo/src/az_gmres.c:239
> >
> >The problem arises only for large values of kspace which is related
> >to the max number of iterations. We've set this to a rather large
> >value of 5000 (We don't usually need that many, BUT the code
> >should hopefully still be able to handle this or fail
> >gracefully. Things work ok for smaller values, e.g kspace=1000).
> >
> >Following the return from this call, the memory allocated in
> >AZ_manage_memory(...) gets distributed into two vectors, hh
> >and v, and it's v that contains the illegal memory address:
> >Placing a breakpoint in
> >
> >    trilinos-11.4.3-Source/packages/aztecoo/src/az_gmres.c:248
> >
> >(just after that loop) and interrogating various values of v yields:
> >
> >(gdb) print v[5000]
> >$1 = (double *) 0x8001cb466110
> >
> >and, predictably:
> >
> >(gdb) print *v[5000]
> >Cannot access memory at address 0x8001cb466110
> >
> >whereas
> >
> >(gdb) print *v[500]
> >$4 = 0
> >
> >is fine.
> >
> >Trial and error shows that things go wrong beyond entry 518:
> >
> >(gdb) print *v[519]
> >Cannot access memory at address 0x7ffff0bf14f0
> >(gdb) print *v[518]
> >$8 = 0
> >
> >Further information:
> >
> >   -- All code was completely built from source, using gcc
> >      without optimisation and with -g.
> >
> >   -- Based on a (small) sample of machines, the problem only
> >      arises on 64 bit machines (not 32)
> >
> >   -- The problem only arises for sufficiently big problem sizes
> >      (though they are still way short of the machines' total
> >      available memory). When running on a machine with very
> >      little memory, the call to AZ_manage_memory(...) fails
> >      gracefully with the "maybe you should try a smaller problem"
> >      message.
> >
> >   -- The problem arises with both serial and parallel installations
> >      (i.e. when the code is compiled with and without mpi support)
> >      and with different trilinos releases.
> >
> >  -- The problem is difficult to isolate further since we use
> >      trilinos from within our own big library (which provides the
> >      preconditioner). Note that our code works fine if we use our
> >      own (serial) GMRES solver (or a direct solver).
> >
> >   Does any of this ring a bell?
> >
> >      Happy to run further tests here or provide additional diagnostic
> >information.
> >
> >      Best wishes,
> >
> >              Matthias
> >
> >--
> >--------------------------------------------------------------------------
> >-
> >Professor Matthias Heil
> >
> >Alan Turing Building, Room 2.224
> >School of Mathematics           Tel. +44 (0)161 275 5808
> >University of Manchester        Fax. +44 (0)161 275 5819
> >Oxford Road                     email: M.Heil at maths.man.ac.uk
> >Manchester M13 9PL              WWW: http://www.maths.man.ac.uk/~mheil/
> >U.K.
> >
> >NEWS:   The beta release of oomph-lib, the object-oriented
> >         multi-physics finite-element library is now available
> >         as free open-source software at
> >
> >             http://www.oomph-lib.org
> >
> >--------------------------------------------------------------------------
> >-
> >
> >_______________________________________________
> >Trilinos-Users mailing list
> >Trilinos-Users at software.sandia.gov
> >http://software.sandia.gov/mailman/listinfo/trilinos-users
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 21 Jan 2014 09:33:47 -0500
> From: <Gary.Myers.Contractor at unnpp.gov>
> Subject: Re: [Trilinos-Users] [EXTERNAL] Re:  Trilinos Build Problem
> To: <Gary.Myers.Contractor at unnpp.gov>, <bmpersc at sandia.gov>,
>         <trilinos-users at software.sandia.gov>
> Message-ID:
>         <7FC4686EDC94604991223BE416ADD1DA0D3BF0 at UIBETPEXV1.ias.unrpnet.gov
> >
> Content-Type: text/plain;       charset="us-ascii"
>
> Brent,
>
> As you suspected, there is a conflict with an older version of Trilinos
> which was installed in the same area as the BOOST I referenced in the
> build of Trilinos 11.4.2.
>
> I managed to separate BOOST from the old Trilinos installation and
> tested against the simple Trilinos 11.4.2 combined teuchos and sacado
> package build with success.
>
> THANKS VERY MUCH FOR YOUR HELP!
>
> Gary
>
> -----Original Message-----
> From: trilinos-users-bounces at software.sandia.gov
> [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of
> Gary.Myers.Contractor at unnpp.gov
> Sent: Monday, January 20, 2014 1:05 PM
> To: bmpersc at sandia.gov; trilinos-users at software.sandia.gov
> Subject: Re: [Trilinos-Users] [EXTERNAL] Re: Trilinos Build Problem
>
> Brent,
>
> I am in the process of getting the full verbose compile over to our
> computers which interface with the internet.
>
> In the meantime, I have successfully built and tested teuchos by itself
> from Trilinos 11.4.2.  Then I attempted to build and test just teuchos
> and sacado.  This attempt failed in a manner consistent with
> what I have been getting with the full build.  I looked at the
> Teuchos_config.h file and the TEUCHOSCORE_LIB_DLL_EXPORT macro is not
> defined.
>
> Now, there is a possibility that my environmental pathnames could be
> pulling in an older version.  Let me check into this and see if this is
> causing an issue.
>
> I will get back to you on this.
>
> Thanks,
>
> Gary
>
> -----Original Message-----
> From: Perschbacher, Brent M [mailto:bmpersc at sandia.gov]
> Sent: Monday, January 20, 2014 12:29 PM
> To: Myers, Gary T (Contractor); bundeka at id.doe.gov;
> trilinos-users at software.sandia.gov
> Subject: Re: [EXTERNAL] Re: [Trilinos-Users] Trilinos Build Problem
>
> Gary,
>   That macro being empty is fine, in fact it is expected to be empty
> more often than not. However, the error in question is stating that the
> macro isn't defined, even as empty. Since the preprocessor didn't
> replace it the compiler is trying to figure it out based on context and
> isn't able to.
> Getting the build files from you would help greatly, specifically the
> full verbose compile line and error message. I'm not sure if this is the
> cause of your issues, but do you have a previously installed copy of
> Trilinos somewhere on your machine? Some Linux distributions have been
> shipping older versions of Trilinos for a while now. It is possible
> that an older copy of Teuchos_DLLExportMacro.h is being picked up
> instead of the copy in your source tree. When teuchos was refactored
> into subpackages that file was changed to no longer be a generated file,
> but instead to use a static copy. This was the solution we settled on
> for teuchos when we discovered how the subpackages interacted with our
> windows DLL build.
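
The difference between a macro that is defined as empty and one that is not defined at all can be sketched as follows. This is an illustration of the general export-macro pattern, not the contents of the Teuchos header: MYLIB_DLL_EXPORT, MYLIB_BUILDING_SHARED and exportedAnswer are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for a macro such as TEUCHOSCORE_LIB_DLL_EXPORT.
// On Windows shared builds it marks symbols for export from the DLL;
// everywhere else it is still defined, but expands to nothing.
#if defined(_WIN32) && defined(MYLIB_BUILDING_SHARED)
#  define MYLIB_DLL_EXPORT __declspec(dllexport)
#else
#  define MYLIB_DLL_EXPORT
#endif

// With the macro defined (even as empty) this is an ordinary declaration.
// If the macro were not defined at all, the compiler would see an unknown
// identifier in front of "int" and reject the line, which is the kind of
// "no storage class or type specifier" error reported in this thread.
MYLIB_DLL_EXPORT int exportedAnswer ();

int exportedAnswer ()
{
  return 42;
}
```

If an older header without the empty fallback is found first on the include path, the declaration fails exactly as if the #else branch above were missing.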
>
> Brent
>
> On 1/17/14 10:15 AM, "Gary.Myers.Contractor at unnpp.gov"
> <Gary.Myers.Contractor at unnpp.gov> wrote:
>
> >Brent and Kermit,
> >
> >Thanks for the suggestions.  Although it goes against what I want in
> >my final product, I did set BUILD_SHARED_LIBS to OFF.  However, I still
> >get a similar error in Teuchos.  In either case for BUILD_SHARED_LIBS,
> >I think TEUCHOSCORE_LIB_DLL_EXPORT is being defined, but as an empty
> >macro.  However, because this macro is being used in some declarations,
> >it is producing compile errors. It's not clear to me how I am digging
> >myself into this hole, so I am in the process of getting output files
> >over to a computer so I can share with everyone.  In the meantime,
> >here is my configuration command line, which I am typing into this
> >e-mail (so there may be some typos). Because many items are referenced
> >from nonstandard areas, they need to be explicitly defined in the cmake
> >command line.
> >
> >cmake -D BUILD_SHARED_LIBS:BOOL=ON \
> >       -D CMAKE_BUILD_TYPE:STRING='Release' \
> >       -D CMAKE_INSTALL_PREFIX:PATH= (path name here) \
> >       -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=ON \
> >       -D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON \
> >       -D Trilinos_ENABLE_TESTS:BOOL=ON \
> >       -D Trilinos_ENABLE_SECONDARY_STABLE_CODE:BOOL=ON \
> >       -D Trilinos_ENABLE_Fortran:BOOL=ON \
> >       -D TPL_ENABLE_MPI:BOOL=ON \
> >       -D MPI_BASE_DIR:PATH= (path name here) \
> >       -D MPI_EXEC:FILEPATH= (path name here) \
> >       -D MPI_C_COMPILER:FILEPATH= (path name to openmpi 1.4.5 mpicc) \
> >       -D MPI_CXX_COMPILER:FILEPATH= (path name to openmpi 1.4.5
> >mpicxx) \
> >       -D MPI_Fortran_COMPILER:FILEPATH= (path name to openmpi 1.4.5
> >mpif90) \
> >       -D TPL_ENABLE_Boost:BOOL=ON \
> >       -D Boost_LIBRARY_DIRS:PATH= (path name here) \
> >       -D Boost_INCLUDE_DIRS:PATH= (path name here) \
> >       -D TPL_ENABLE_MKL:BOOL=ON \
> >       -D MKL_LIBRARY_DIRS:FILEPATH= (path name to intel64 libraries) \
> >       -D MKL_INCLUDE_DIRS:FILEPATH= (path name to intel64 includes) \
> >       -D MKL_LIBRARY_NAMES:STRING='mkl_rt' \
> >       -D DOXYGEN_EXECUTABLE:FILEPATH= (path name) \
> >       -D Netcdf_INCLUDE_DIRS:PATH= (path name) \
> >       -D Netcdf_LIBRARY_DIRS:PATH= (path name) \
> >       -D Matio_INCLUDE_DIRS:PATH= (path name) \
> >       -D Matio_LIBRARY_DIRS:PATH= (path name) \
> >       -D PYTHON_EXECUTABLE:FILEPATH= (path name to python 2.7.5) \
> >       -D PYTHON_INCLUDE_DIRS:PATH= (path name to python 2.7.5
> >includes) \
> >       -D PYTHON_LIBRARIES:PATH= (path list to python 2.7 libraries) \
> >       -D BLAS_LIBRARY_DIRS:PATH= (path to intel64 library ) \
> >       -D BLAS_LIBRARY_NAMES:STRING='mkl_rt' \
> >       -D LAPACK_LIBRARY_DIRS:PATH= (path to intel64 library ) \
> >       -D LAPACK_LIBRARY_NAMES:STRING= 'mkl_rt' \
> >       (path name to source distribution for trilinos 11.4.2)
> >
> >Error occurs in line 82 of Teuchos_ScalarTraits.hpp - this declaration
> >has no storage class or type specifier
> >       TEUCHOSCORE_LIB_DLL_EXPORT
> >
> >Error occurs in line 83 of Teuchos_ScalarTraits.hpp - expected a ;
> >       void throwScalarTraitsNanInfError( const std::string &errMsg );
> >
> >. . . . etc. . . .
> >
> >Please note that if I disable Teuchos I am successful, but then I lose
> >too many other packages which are important to me.  Any additional
> >guidance of how I can dig out of this hole would be great.
> >
> >Regards,
> >
> >Gary T. Myers
> >Principal Scientist
> >Bechtel Marine Propulsion Corp.
> >Gary.Myers.contractor at unnpp.gov
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Perschbacher, Brent M [mailto:bmpersc at sandia.gov]
> >Sent: Thursday, January 16, 2014 4:00 PM
> >To: Myers, Gary T (Contractor); bundeka at id.doe.gov;
> >trilinos-users at software.sandia.gov
> >Subject: Re: [EXTERNAL] Re: [Trilinos-Users] Trilinos Build Problem
> >
> >Gary,
> >  You really don't want to force _WIN32 to be set if you aren't on a
> >Windows machine creating a shared build. The TEUCHOSCORE_LIB_DLL_EXPORT
> >macro is used to specify which classes/functions will be publicly
> >available inside DLLs. However, it should be defined, but as empty, on
> >non-Windows machines. The last few lines of Teuchos_DLLExportMacro.h
> >should have that assignment to empty. I have an idea what might be
> >causing this, but I'm not sure why it wouldn't have caused us trouble
> >before if it is the case. I will look into it. In the meantime you can
> >either build a static build (i.e. don't set BUILD_SHARED_LIBS to ON) or
> >you could add the empty define manually with CMAKE_CXX_FLAGS and
> >CMAKE_C_FLAGS.
> >
> >Brent
> >
> >On 1/16/14 1:30 PM, "Gary.Myers.Contractor at unnpp.gov"
> ><Gary.Myers.Contractor at unnpp.gov> wrote:
> >
> >>Kermit,
> >>
> >>Yes, I already have this set.  But I feel that _WIN32 is not set
> >>(which my intuition suggests is correct) since I am not on a WIN 32
> >>based computer.  I am on a Linux 64 bit cluster.  I guess I could
> >>force _WIN32 to be defined and see what happens.
> >>
> >>Gary
> >>
> >>
> >>
> >>-----Original Message-----
> >>From: Bunde, Kermit A [mailto:bundeka at id.doe.gov]
> >>Sent: Thursday, January 16, 2014 3:02 PM
> >>To: Myers, Gary T (Contractor)
> >>Subject: RE: [Trilinos-Users] Trilinos Build Problem
> >>
> >>Gary,
> >>
> >>
> >>
> >>Please ignore my last reply.  In doing a little google searching:
> >>
> >>
> >>
> >>http://msdn.microsoft.com/en-us/library/b0084kay.aspx lists
> >>
> >>
> >>
> >>_WIN32 as a predefined macro.
> >>
> >>
> >>
> >>
> >>
> >>Try adding this to your do_configure script:
> >>
> >>
> >>
> >>  -D BUILD_SHARED_LIBS:BOOL=ON \
> >>
> >>Kermit Bunde
> >>Enforcement Coordinator
> >>Criticality Safety SME
> >>Nuclear Safety SME
> >>DOE-ID Aviation Safety Officer
> >>208-526-5188 (office)
> >>208-526-1926 (fax)
> >>208-680-6843 (cell)
> >>"Accept the challenges so that you may feel the exhilaration of
> >>victory."
> >>
> >>Never tell people how to do things. Tell them what to do and they will
>
> >>surprise you with their ingenuity."
> >>
> >>--George S. Patton Jr.,
> >>American Army general
> >>
> >>
> >>
> >>From: trilinos-users-bounces at software.sandia.gov
> >>[mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of
> >>Gary.Myers.Contractor at unnpp.gov
> >>Sent: Thursday, January 16, 2014 8:42 AM
> >>To: trilinos-users at software.sandia.gov
> >>Subject: [Trilinos-Users] Trilinos Build Problem
> >>
> >>
> >>
> >>Hi,
> >>
> >>
> >>
> >>Newbie question here: First time build of Trilinos using CMake.
> >>
> >>
> >>
> >>I am building Trilinos 11.4.2 on a Linux cluster using Intel compilers,
> >>OpenMPI, MKL, . . .
> >>
> >>CMake configuration completes, but Teuchos fails to compile because the
> >>macro TEUCHOSCORE_LIB_DLL_EXPORT is not explicitly defined.  The first
> >>reference occurs on line 82 of Teuchos_ScalarTraits.hpp.
> >>
> >>
> >>
> >>In looking at Teuchos_DLLExportMacro.h, I could see how this could
> >>happen since _WIN32 probably is not defined.
> >>
> >>
> >>
> >>Can someone suggest how I can get past this (please note that I am not
> >>in a position to easily share files since the Linux Cluster is not on
> >>the grid)?
> >>
> >>
> >>
> >>Thanks,
> >>
> >>
> >>
> >>Gary T. Myers
> >>
> >>Principal Scientist
> >>
> >>Bechtel Marine Propulsion Corp.
> >>
> >>Gary.Myers.contractor at unnpp.gov
> >>
> >>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 21 Jan 2014 14:52:37 +0000
> From: "Bunde, Kermit A" <bundeka at id.doe.gov>
> Subject: [Trilinos-Users] CMAKE issues in PyTrilinos
> To: "'trilinos-users at software.sandia.gov'"
>         <trilinos-users at software.sandia.gov>
> Message-ID:
>         <1A714649FC7A5A44ACF6FE45DAFD86275823471D at IDXCH1.id.doe.lcl>
> Content-Type: text/plain; charset="us-ascii"
>
> I believe that there is an error in the code fragment from the
> CMakeLists.txt file in "trilinos-11.4.3-Source\packages\PyTrilinos\src "
> directory below:
>
> #
> # On Mac OS X Gnu compilers, add dynamic lookup for undefined symbols
> # to the pytrilinos library and PyTrilinos extension modules
> SET(EXTRA_LINK_ARGS "${CMAKE_SHARED_LINKER_FLAGS}")
> IF(APPLE)
>   IF((CMAKE_CXX_COMPILER_ID MATCHES) "GNU" OR (CMAKE_CXX_COMPILER_ID
> MATCHES "Clang"))
>    SET(EXTRA_LINK_ARGS "${EXTRA_LINK_ARGS} -undefined dynamic_lookup")
>   ENDIF()
> ENDIF(APPLE)
>
>
> This line:
> IF((CMAKE_CXX_COMPILER_ID MATCHES) "GNU" OR (CMAKE_CXX_COMPILER_ID MATCHES
> "Clang"))
>
> Should be:
> IF((CMAKE_CXX_COMPILER_ID MATCHES "GNU") OR (CMAKE_CXX_COMPILER_ID MATCHES
> "Clang"))
>
> Also I am getting a parse error of this form from one of the lines below
> from the same file:
> Parse error. Function missing ending ")".
> Instead found unterminated string with text ")
> ".
> #
> # Add the additional "make clean" files
> GET_DIRECTORY_PROPERTY(clean_files ADDITIONAL_MAKE_CLEAN_FILES)
> LIST(APPEND            clean_files ${ADDITIONAL_CLEAN_FILES})
> LIST(REMOVE_DUPLICATES clean_files)
> LIST(REMOVE_ITEM       clean_files "")
> SET_DIRECTORY_PROPERTIES(PROPERTIES ADDITIONAL_MAKE_CLEAN_FILES
> "${clean_files}")
>
>
> Thanks for your help.
> Kermit Bunde
>
>
> ------------------------------
>
> Message: 5
> Date: Tue, 21 Jan 2014 15:58:38 +0000
> From: Matthias Heil <matthias.heil at manchester.ac.uk>
> Subject: Re: [Trilinos-Users] aztec00 problem
> To: "Heroux, Mike" <MHeroux at CSBSJU.EDU>
> Cc: Andrew Hazel <ahazel at maths.manchester.ac.uk>,       David Wilke
>         <david.wilke at student.adelaide.edu.au>,
>         "trilinos-users at software.sandia.gov"
>         <trilinos-users at software.sandia.gov>,
>         "trilinos-bugs at software.sandia.gov"
>         <trilinos-bugs at software.sandia.gov>
> Message-ID: <52DE992E.5070901 at manchester.ac.uk>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Mike,
>
>     we've made some progress. After some more digging we've
> established that the offending call to AZ_manage_memory is
> made with the following arguments:
>
> AZ_manage_memory (input_size=295202080,
>                     action=0,
>                     type=-914901,
>                     name=0x7fffffffd720 "vblock in gmres0",
>                     status=0x7fffffffd6c8)
>
>
> at trilinos-11.4.3-Source/packages/aztecoo/src/az_util.c:944
> when "viewed" from within AZ_manage_memory.
>
> However, looking at the calling code, the first argument, input_size,
> is derived from: kspace=5000; aligned_N_total=222084;
> sizeof(double)=8, so it should be (5000+1)*222084*8=8885136672.
>
> Andrew Hazel then wrote the small test code, below, which shows that
> 295202080 is the value given to an unsigned int that stores the result
> of the calculation. The problem appears to be that the first argument
> to AZ_manage_memory is an unsigned int, rather than an unsigned
> long (or some other custom type).
>
>       Matthias
>
>
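> #include <iostream>
>
> int main()
> {
>   unsigned kspace = 5000;
>   unsigned aligned_N_total = 222084;
>
>   // Same expression stored in two types. sizeof yields a size_t, so on a
>   // 64-bit build the product is 64 bits wide and is truncated only when
>   // it is assigned to the 32-bit unsigned.
>   unsigned temp = (kspace+1)*aligned_N_total*sizeof(double);
>   unsigned long temp2 = (kspace+1)*aligned_N_total*sizeof(double);
>
>   // On a typical 64-bit Linux build this prints: 295202080 8885136672
>   std::cout << temp << " " << temp2 << "\n";
> }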
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Tue, 21 Jan 2014 17:05:20 +0000
> From: "Heroux, Mike" <MHeroux at csbsju.edu>
> Subject: Re: [Trilinos-Users] aztec00 problem
> To: Matthias Heil <matthias.heil at manchester.ac.uk>
> Cc: Andrew Hazel <ahazel at maths.manchester.ac.uk>,       David Wilke
>         <david.wilke at student.adelaide.edu.au>,
>         "trilinos-bugs at software.sandia.gov"
>         <trilinos-bugs at software.sandia.gov>,
>         "trilinos-users at software.sandia.gov"
>         <trilinos-users at software.sandia.gov>
> Message-ID: <CF04042E.B28A6%mheroux at csbsju.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> Matthias,
>
> AztecOO has been "upgraded" to handle larger problems.  We can now use it
> for problems where the global integer data is "long long", but we decided
> to avoid a complete transition to 64-bit ints.  Instead, the Belos
> package, along with Tpetra is our long-term solution for this issue.
>
> I am recording the result of this conversation with trilinos-bugs, so we
> can get the fix on the queue.
>
> Thanks.
>
> Mike
>
> On 1/21/14 12:00 PM, "Matthias Heil" <matthias.heil at manchester.ac.uk>
> wrote:
>
> >Mike,
> >
> >   thanks for your quick reply. I agree that if this
> >
> >"AztecOO is not designed to work with problems
> >where the size of any local data objects is beyond the range of signed
> >32-bit ints."
> >
> >is the policy (which is sensible -- if potentially inconvenient/confusing
> >for the user who can't necessarily assess what happens
> >internally) then it's not a bug. I'd noticed some recent changes
> >to aztec's source code where ints had been upgraded to long ints,
> >and inferred (wrongly!) that there was a general attempt to allow
> >it to handle bigger problems. I had hoped that we might simply
> >need a similar tweak here but do realise that there would almost
> >certainly be additional problems lurking further "downstream".
> >
> >However, adding some internal sanity checking that issues warnings
> >(or aborts) if this problem arises would be VERY helpful. Took us
> >quite a while to get to the bottom of this...
> >
> >   Anyway, with your explanation I'm happy to regard this as
> >resolved...
> >
> >    Thanks for the quick feedback (and the great code!).
> >
> >      Best wishes,
> >
> >        Matthias
> >
> >
> >
> >
> >On 21/01/14 16:26, Heroux, Mike wrote:
> >> Matthias,
> >>
> >> Just to make sure I understand:  This is a real overflow of range, so the
> >> issue is not a bug in the correct execution of AztecOO, but in not
> >> detecting the memory error.  AztecOO is not designed to work with problems
> >> where the size of any local data objects is beyond the range of signed
> >> 32-bit ints.
> >>
> >> It seems that we could add a quick check to AZ_manage_memory that would
> >> copy the input_size value into a signed int and then compare the result,
> >> or something similar.
> >>
> >> Is this the kind of fix you could use?  Please let me know if I am missing
> >> the point.
> >>
> >> Thanks.
> >>
> >> Mike
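
The check proposed in the message above could look roughly like the following. This is a minimal sketch, not AztecOO code: the helper name checked_size_as_int32 and its signature are hypothetical, and it assumes the caller still has the element count and element size as 64-bit values, so the test runs before any narrowing to a 32-bit type takes place.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of AztecOO): computes count * elem_size in
// 64 bits and stores the result in *out only if it fits in a signed 32-bit
// int. Returns false, leaving *out untouched, when the size is too large.
bool checked_size_as_int32(std::uint64_t count, std::uint64_t elem_size,
                           std::int32_t* out)
{
  // Guard against overflow of the 64-bit product itself.
  if (elem_size != 0 &&
      count > std::numeric_limits<std::uint64_t>::max() / elem_size) {
    return false;
  }
  const std::uint64_t bytes = count * elem_size;

  // Reject anything beyond the range of a signed 32-bit int.
  const std::uint64_t limit =
      static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
  if (bytes > limit) {
    return false;
  }
  *out = static_cast<std::int32_t>(bytes);
  return true;
}
```

With the numbers from this thread, a request for (5000+1)*222084 doubles would be rejected, while (1000+1)*222084 doubles (1778448672 bytes) would be accepted, which matches the observation that kspace=1000 works.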
> >>
> >> On 1/21/14 9:58 AM, "Matthias Heil" <matthias.heil at manchester.ac.uk> wrote:
> >>
> >>> Mike,
> >>>
> >>>     we've made some progress. After some more digging we've
> >>> established that the offending call to AZ_manage_memory is
> >>> made with the following arguments:
> >>>
> >>> AZ_manage_memory (input_size=295202080,
> >>>                     action=0,
> >>>                     type=-914901,
> >>>                     name=0x7fffffffd720 "vblock in gmres0",
> >>>                     status=0x7fffffffd6c8)
> >>>
> >>>
> >>> at trilinos-11.4.3-Source/packages/aztecoo/src/az_util.c:944
> >>> when "viewed" from within AZ_manage_memory.
> >>>
> >>> However, looking at the calling code, the first argument, input_size,
> >>> is derived from: kspace=5000; aligned_N_total=222084;
> >>> sizeof(double)=8, so it should be (5000+1)*222084*8=8885136672.
> >>>
> >>> Andrew Hazel then wrote the small test code, below, which shows that
> >>> 295202080 is the value given to an unsigned int that stores the result
> >>> of the calculation. The problem appears to be that the first argument
> >>> to AZ_manage_memory is an unsigned int, rather than an unsigned
> >>> long (or some other custom type).
> >>>
> >>>       Matthias
> >>>
> >>>
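> >>> #include <iostream>
> >>>
> >>> int main()
> >>> {
> >>>    unsigned kspace = 5000;
> >>>    unsigned aligned_N_total = 222084;
> >>>
> >>>    // The product is 8885136672; stored in a 32-bit unsigned it wraps
> >>>    // modulo 2^32 to 295202080.
> >>>    unsigned temp = (kspace+1)*aligned_N_total*sizeof(double);
> >>>    // Where unsigned long is 64 bits wide, the full value is kept.
> >>>    unsigned long temp2 = (kspace+1)*aligned_N_total*sizeof(double);
> >>>
> >>>    std::cout << temp << " " << temp2 << "\n";
> >>> }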
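
The wrap-around shown by the test program above can also be written with fixed-width types, so that the result does not depend on how wide unsigned and unsigned long happen to be on a given platform. This is a minimal sketch; the function names are illustrative only, and it assumes sizeof(double) is 8, as stated in the message.

```cpp
#include <cstdint>

// Full byte count of the workspace discussed in the thread:
// (kspace + 1) vectors of n doubles each, computed in 64 bits.
std::uint64_t workspace_bytes(std::uint64_t kspace, std::uint64_t n)
{
  return (kspace + 1) * n * sizeof(double);
}

// What is left of a byte count once it is stored in a 32-bit unsigned
// integer: the value modulo 2^32.
std::uint32_t truncated_to_32_bits(std::uint64_t bytes)
{
  return static_cast<std::uint32_t>(bytes);
}
```

For kspace=5000 and aligned_N_total=222084 the full count is 8885136672, and 8885136672 - 2*4294967296 = 295202080, which is the input_size value seen inside AZ_manage_memory.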
> >>>
> >>>
> >>>
> >>> On 20/01/14 21:18, Heroux, Mike wrote:
> >>>> Matthias,
> >>>>
> >>>> Do you have a sense of whether or not the data sizes you are using would
> >>>> result in array indexing that exceeds 2.1 billion?  The kinds of issues you
> >>>> are seeing would be consistent with trying to address an array using an
> >>>> integer value that is bigger than what signed int can handle.
> >>>>
> >>>> The hex values you are printing are very large (more than 140 trillion),
> >>>> which seems to indicate an incorrect address calculation somewhere.  I
> >>>> agree that the memory manager should detect the issue, no matter what.
> >>>>
> >>>> Mike
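
The two magnitudes mentioned in the message above (the 2.1 billion limit of a signed int and the "more than 140 trillion" address) can be checked directly. This is a small sketch with illustrative names; it assumes a 32-bit signed int and only restates numbers that appear in this thread.

```cpp
#include <cstdint>
#include <limits>

// Largest value a signed 32-bit int can hold (about 2.1 billion).
constexpr std::int64_t kMaxSignedInt32 =
    std::numeric_limits<std::int32_t>::max();

// Address reported by gdb in the original report, read as an integer.
constexpr std::uint64_t kReportedAddress = 0x8001cb466110ULL;

// True when a 64-bit index or size cannot be represented as a signed
// 32-bit int.
constexpr bool exceeds_signed_int32(std::int64_t value)
{
  return value > kMaxSignedInt32;
}
```

The intended byte count from this thread, 8885136672, exceeds the limit, whereas the truncated value 295202080 does not, which is why a check on the already truncated argument alone cannot flag the problem.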
> >>>>
> >>>> On 1/20/14 8:24 AM, "Matthias Heil" <matthias.heil at manchester.ac.uk>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>>     we've come across a possible bug in trilinos aztecoo.
> >>>>> The code seg faults when trying to execute the line
> >>>>>
> >>>>>      *dst_ptr++ = s;
> >>>>>
> >>>>> in
> >>>>>
> >>>>> trilinos-11.4.3-Source/packages/epetra/src/Epetra_CrsMatrix.cpp:3327
> >>>>>
> >>>>> An attempt to de-reference that pointer (in ddd) shows:
> >>>>>
> >>>>> (gdb) print *dst_ptr
> >>>>> Cannot access memory at address 0x8001cb466110
> >>>>>
> >>>>> Moving back through the call stack shows that the
> >>>>> memory is initially allocated in AZ_manage_memory(...)
> >>>>> which is called from just under
> >>>>>
> >>>>>      trilinos-11.4.3-Source/packages/aztecoo/src/az_gmres.c:239
> >>>>>
> >>>>> The problem arises only for large values of kspace which is related
> >>>>> to the max number of iterations. We've set this to a rather large
> >>>>> value of 5000 (We don't usually need that many, BUT the code
> >>>>> should hopefully still be able to handle this or fail
> >>>>> gracefully. Things work ok for smaller values, e.g. kspace=1000).
> >>>>>
> >>>>> Following the return from this call, the memory allocated in
> >>>>> AZ_manage_memory(...) gets distributed into two vectors, hh
> >>>>> and v, and it's v that contains the illegal memory address:
> >>>>> Placing a breakpoint in
> >>>>>
> >>>>>      trilinos-11.4.3-Source/packages/aztecoo/src/az_gmres.c:248
> >>>>>
> >>>>> (just after that loop) and interrogating various values of v yields:
> >>>>>
> >>>>> (gdb) print v[5000]
> >>>>> $1 = (double *) 0x8001cb466110
> >>>>>
> >>>>> and, predictably:
> >>>>>
> >>>>> (gdb) print *v[5000]
> >>>>> Cannot access memory at address 0x8001cb466110
> >>>>>
> >>>>> whereas
> >>>>>
> >>>>> (gdb) print *v[500]
> >>>>> $4 = 0
> >>>>>
> >>>>> is fine.
> >>>>>
> >>>>> Trial and error shows that things go wrong beyond entry 518:
> >>>>>
> >>>>> (gdb) print *v[519]
> >>>>> Cannot access memory at address 0x7ffff0bf14f0
> >>>>> (gdb) print *v[518]
> >>>>> $8 = 0
> >>>>>
> >>>>> Further information:
> >>>>>
> >>>>>     -- All code was completely built from source, using gcc
> >>>>>        without optimisation and with -g.
> >>>>>
> >>>>>     -- Based on a (small) sample of machines, the problem only
> >>>>>        arises on 64 bit machines (not 32)
> >>>>>
> >>>>>     -- The problem only arises for sufficiently big problem sizes
> >>>>>        (though they are still way short of the machines' total
> >>>>>        available memory). When running on a machine with very
> >>>>>        little memory, the call to AZ_manage_memory(...) fails
> >>>>>        gracefully with the "maybe you should try a smaller problem"
> >>>>>        message.
> >>>>>
> >>>>>     -- The problem arises with both serial and parallel installations
> >>>>>        (i.e. when the code is compiled with and without mpi support)
> >>>>>        and with different trilinos releases.
> >>>>>
> >>>>>     -- The problem is difficult to isolate further since we use
> >>>>>        trilinos from within our own big library (which provides the
> >>>>>        preconditioner). Note that our code works fine if we use our
> >>>>>        own (serial) GMRES solver (or a direct solver).
> >>>>>
> >>>>>     Does any of this ring a bell?
> >>>>>
> >>>>>        Happy to run further tests here or provide additional diagnostic
> >>>>>        information.
> >>>>>
> >>>>>        Best wishes,
> >>>>>
> >>>>>                Matthias
> >>>>>
> >
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at software.sandia.gov
> http://software.sandia.gov/mailman/listinfo/trilinos-users
>
>
> End of Trilinos-Users Digest, Vol 101, Issue 9
> **********************************************
>



-- 

Dr. Riccardo Rossi, Civil Engineer
Member of Kratos Team

International Center for Numerical Methods in Engineering - CIMNE
Campus Norte, Edificio C1
c/ Gran Capitán s/n
08034 Barcelona, España

Tel:       (+34) 93 401 56 96
Fax:       (+34) 93.401.6517
web:       www.cimne.com