[Trilinos-Users] [EXTERNAL] GMRES inconsistent when using multiple threads (Belos+Tpetra, since 12.12.1)
Thornquist, Heidi K
hkthorn at sandia.gov
Thu Nov 9 14:52:47 EST 2017
Hi Christopher,
That is odd behavior. Does your driver use the BelosBlockGmresSolMgr or BelosPseudoBlockGmresSolMgr? If you don’t mind sharing your solver driver, that would be helpful too!
Many thanks!
Heidi
On 11/9/17, 11:28 AM, "Trilinos-Users on behalf of Christopher Thiele" <trilinos-users-bounces at trilinos.org on behalf of ct37 at rice.edu> wrote:
Hello,
I noticed some strange behavior in Belos' GMRES solver when using Tpetra
and multiple threads. I wrote a little tool (solver_driver) that just
reads in a matrix market file and solves a linear system to reproduce
it. Here are some results with the rma10 matrix from the University of
Florida collection (the choice is rather arbitrary):
./solver_driver_12.10.1 --threads=2 --matrix=rma10.mtx --rhs=rma10_b.mtx
--tol=1.0e-1 --solver=GMRES --disable-preconditioner
---> Converges in just 3 iterations, independent of the number of
threads etc. This seems reasonable given the tolerance of 1e-1.
./solver_driver_12.12.1 --threads=2 --matrix=rma10.mtx --rhs=rma10_b.mtx
--tol=1.0e-1 --solver=GMRES --disable-preconditioner
---> Converges in either 61 or 62 iterations, but the final residual
changes in the order of 1e-4 with each run.
OMP_PROC_BIND=true ./solver_driver_12.12.1 --threads=2
--matrix=rma10.mtx --rhs=rma10_b.mtx --tol=1.0e-1 --solver=GMRES
--disable-preconditioner
---> Converges in 121 iterations, and the final residual changes in the
order of 1e-3.
mpirun -n 1 ./solver_driver_12.12.1 --threads=2 --matrix=rma10.mtx
--rhs=rma10_b.mtx --tol=1.0e-1 --solver=GMRES --disable-preconditioner
---> Seems to solve the problem, consistently reproduces the result
obtained with 12.10.1
I also tested the current master branch from GitHub, and it shows the
same behavior as the 12.12.1 release. The problem does not occur with
other solvers like BiCGStab or CG. I added some information about my
system and the Trilinos configuration below. I can also share the code
(~200 loc), but it is really just a driver for the solvers.
Best regards,
Christopher
System:
OS: RHEL 7 (x84_64)
Compiler: GCC 4.8.5
MPI: Open MPI 1.10.6
CPU: Xeon E3-1220 V2 (4 cores)
Memory: 8 GB
Trilinos configuration script:
#!/bin/bash
set -e
SOURCE_BASE=/path/to/source
INSTALL_DIR=/path/to/installation
TRILINOS_PKG_CONF=" \
-DTrilinos_ENABLE_Teuchos=ON \
-DTrilinos_ENABLE_TeuchosParser=ON \
-DTrilinos_ENABLE_Belos=ON \
-DTrilinos_ENABLE_Ifpack2=ON \
-DTrilinos_ENABLE_MueLu=OFF \
-DTrilinos_ENABLE_Tpetra=ON \
-DTrilinos_ENABLE_Teko=OFF \
-DTrilinos_ENABLE_Zoltan2=OFF \
-DTrilinos_ASSERT_MISSING_PACKAGES=OFF \
"
TP_PKG_CONF=" \
-DTPL_ENABLE_MPI=ON \
-DTPL_ENABLE_CUDA=OFF \
-DTPL_ENABLE_MKL=ON \
-DTPL_MKL_LIBRARIES=/path/to/mkl/lib/intel64/libmkl_rt.so \
-DTPL_MKL_INCLUDE_DIRS=/path/to/mkl/include \
"
cmake \
-D MPI_C_COMPILER=mpicc \
-D MPI_CXX_COMPILER=mpicxx \
-D MPI_Fortran_COMPILER=mpifort \
-D CMAKE_C_COMPILER=mpicc \
-D CMAKE_CXX_COMPILER=mpicxx \
-D CMAKE_Fortran_COMPILER=mpifort \
-D CMAKE_CXX_FLAGS="-g -O3 -march=native" \
-D CMAKE_BUILD_TYPE=RELEASE \
-D Trilinos_ENABLE_TESTS=OFF \
-D Trilinos_ENABLE_EXPLICIT_INSTANTIATION=ON \
-D Trilinos_CXX11_FLAGS="-std=c++11" \
-D Kokkos_ENABLE_Pthread=OFF \
-D Kokkos_ENABLE_OpenMP=ON \
-D Kokkos_ENABLE_Cuda=OFF \
-D Kokkos_ENABLE_Cuda_UVM=OFF \
-D Trilinos_ENABLE_OpenMP=ON \
-D Tpetra_INST_SERIAL=ON \
-D Tpetra_INST_CUDA=OFF \
-D Tpetra_INST_OPENMP=ON \
-D Tpetra_INST_INT_INT=ON \
-D Tpetra_INST_INT_LONG=OFF \
-D Tpetra_INST_COMPLEX_DOUBLE=OFF \
-D Teuchos_ENABLE_LONG_LONG_INT=OFF \
-D CMAKE_INSTALL_PREFIX=${INSTALL_DIR} \
${TRILINOS_PKG_CONF} \
${TP_PKG_CONF} \
${SOURCE_BASE}
_______________________________________________
Trilinos-Users mailing list
Trilinos-Users at trilinos.org
https://trilinos.org/mailman/listinfo/trilinos-users
More information about the Trilinos-Users
mailing list