[Trilinos-Users] Test fails for MUMPS

Deparis Simone simone.deparis at gmail.com
Sun May 4 08:32:59 MDT 2014


Follow-up:
Using intelmpi with the same version, the problem does not appear. The test runs like a charm.
(Other tests are failing, though.)

In parallel, I am going to check with the system admins that MUMPS has been compiled with the correct MPI version (namely openmpi and intelmpi respectively).
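
A rough first check can also be done without the admins by looking at which MPI shared libraries a binary resolves to. The sketch below is only an illustration: the path patterns used to recognise openmpi and intelmpi are assumptions about typical install layouts, and the helper names (mpi_flavors, ldd) are hypothetical. It also cannot see inside a statically linked MUMPS library, so a clean result does not prove that MUMPS was compiled against the right MPI.

```python
import re
import subprocess
import sys

# Path fragments that usually identify an MPI implementation.
# ASSUMPTION: typical install layouts; adjust for the local cluster.
MPI_PATTERNS = {
    "openmpi": re.compile(r"openmpi|libopen-rte|libopen-pal", re.IGNORECASE),
    "intelmpi": re.compile(r"/impi/", re.IGNORECASE),
}


def mpi_flavors(ldd_output):
    """Return the set of MPI implementations named in ldd-style output."""
    found = set()
    for line in ldd_output.splitlines():
        for name, pattern in MPI_PATTERNS.items():
            if pattern.search(line):
                found.add(name)
    return found


def ldd(path):
    """Run ldd on a binary or shared library and return its text output."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Usage: python check_mpi.py ./Amesos_Test_MUMPS.exe [more binaries...]
    for binary in sys.argv[1:]:
        flavors = mpi_flavors(ldd(binary))
        status = "MIXED" if len(flavors) > 1 else "ok"
        print(binary, sorted(flavors), status)
```

If a binary reports more than one implementation, that would point to exactly the kind of mismatch suspected here.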

Best regards
Simone


On 04 May 2014, at 16:23, Deparis Simone <deparis.simone at gmail.com> wrote:

> Dear developers and users,
> 
> I am running the Trilinos tests with MUMPS enabled, and the test Amesos_Test_MUMPS.exe fails when using more than one MPI process.
> Does anyone have the same problem?
> 
> My configuration:
> trilinos-git, tag  trilinos-release-11-8-1
> MUMPS 4.10.0
> intel compilers 13.0.1
> openmpi 1.6.3 
> 
> Symptoms:
> ========
> deparis at b107:Test_MUMPS > mpirun -np 1 ./Amesos_Test_MUMPS.exe
> ||Ax - b||  = 5.19196e-12
> ||Ax - b||  = 1.41325e-11
> 
> deparis at b107:Test_MUMPS > mpirun -np 4 ./Amesos_Test_MUMPS.exe
> [b107:65729] *** An error occurred in MPI_Waitany
> [b107:65729] *** on communicator MPI_COMM_WORLD
> [b107:65729] *** MPI_ERR_REQUEST: invalid request
> [b107:65729] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 65729 on
> node b107 exiting improperly. There are two reasons this could occur:
> 
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
> 
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
> 
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> 
> 
> Where:
> =====
> Amesos_Mumps.cpp, line 502:
> dmumps_c(&(MDS));   //  Initialize MUMPS
> 
> My Guess:
> =========
> dmumps_c is calling MPI_INIT behind the scenes.
> 
> Does anyone else have the same problem? Can you confirm this guess, or am I wrong?
> 
> Thank you
> Best regards
> Simone
> 
> 


