[Trilinos-Users] [EXTERNAL] How to enable parallelization in sparse matrix multiplication with multiple threads

Rajamanickam, Sivasankaran (-EXP) srajama at sandia.gov
Thu Aug 6 11:51:06 EDT 2015

  The sparse matrix-matrix multiply (MMM) in Trilinos doesn't use node-level parallelism yet, so enabling OpenMP will not make it multi-threaded. It is on our list of things to get done in the near future.


From: Trilinos-Users <trilinos-users-bounces at trilinos.org> on behalf of Zheng Da <zhengda1936 at gmail.com>
Sent: Thursday, August 6, 2015 9:40 AM
To: trilinos-users at trilinos.org
Subject: [EXTERNAL] [Trilinos-Users] How to enable parallelization in sparse matrix multiplication with multiple threads


I notice that Kokkos can parallelize sparse matrix multiplication with
OpenMP. My assumption is that if I pass "-D
Trilinos_ENABLE_OpenMP:BOOL=ON" to cmake, sparse matrix multiplication
should be parallelized by OpenMP. I compiled and installed Trilinos
with Trilinos_ENABLE_OpenMP enabled and compiled my code with it, but
I didn't see any parallelization. `top' shows the code used only one
CPU core all the time. My code is performing sparse matrix
multiplication with Tpetra::CrsMatrix on a sparse square matrix with
roughly 4 million rows and columns.

I compile and link my code with g++ as follows:
g++ -c test-tpetra_multiply.cpp -DKOKKOS_HAVE_OPENMP -g -O3 -I. -Wall \
    -fPIC -std=c++0x -Wno-attributes -fopenmp
g++ -o test-tpetra_multiply test-tpetra_multiply.o -lpthread \
    -rdynamic -fopenmp -lcblas -lprofiler -ltpetra -ltpetrakernels \
    -lteuchoscomm -lteuchoskokkoscomm -lteuchosnumerics \
    -lteuchosparameterlist -lteuchoscore -lanasazi -lkokkoscore \
    -lkokkoscontainers -lteuchoskokkoscompat

Do I need to enable some special flags or link to some parallel Kokkos

I also looked into the code a little and found the OpenMP code in
Kokkos is controlled by KOKKOS_HAVE_OPENMP. So when I compiled my own
code, I also enabled KOKKOS_HAVE_OPENMP, but it still doesn't help.

Another thing I checked is that when I compiled Trilinos, I asked make
to print all g++ commands. I saw -fopenmp was enabled but I didn't see
-DKOKKOS_HAVE_OPENMP at all. I'm not certain the OpenMP code in Kokkos
has been compiled into the Trilinos libraries.

Could anyone tell me how exactly I should enable multi-threaded
parallelization in Tpetra? Or how should I debug this problem?
I'm using Trilinos 12.0.1.

Thank you in advance,