[Trilinos-Users] How to enable parallelization in sparse matrix multiplication with multiple threads
zhengda1936 at gmail.com
Thu Aug 6 11:40:53 EDT 2015
I notice that Kokkos can parallelize sparse matrix multiplication with
OpenMP. My assumption is that if I pass "-D
Trilinos_ENABLE_OpenMP:BOOL=ON" to cmake, sparse matrix multiplication
should be parallelized by OpenMP. I compiled and installed Trilinos
with Trilinos_ENABLE_OpenMP enabled and compiled my code with it, but
I didn't see any parallelization: `top' shows the code using only one
CPU core the whole time. My code is performing sparse matrix
multiplication with Tpetra::CrsMatrix on a sparse square matrix with
roughly 4 million rows and columns.
I compile and link my code with g++ as follows:
  g++ -c test-tpetra_multiply.cpp -DKOKKOS_HAVE_OPENMP -g -O3 -I. -Wall \
      -fPIC -std=c++0x -Wno-attributes -fopenmp
  g++ -o test-tpetra_multiply test-tpetra_multiply.o -lpthread \
      -rdynamic -fopenmp -lcblas -lprofiler -ltpetra -ltpetrakernels \
      -lteuchoscomm -lteuchoskokkoscomm -lteuchosnumerics \
      -lteuchosparameterlist -lteuchoscore -lanasazi -lkokkoscore
Do I need to enable some special flags or link to some parallel Kokkos
I also looked into the code a little and found the OpenMP code in
Kokkos is controlled by KOKKOS_HAVE_OPENMP. So when I compiled my own
code, I also enabled KOKKOS_HAVE_OPENMP, but it still doesn't help.
Another thing I checked is that when I compiled Trilinos, I asked make
to print all g++ commands. I saw -fopenmp was enabled but I didn't see
-DKOKKOS_HAVE_OPENMP at all, so I'm not certain the OpenMP code in Kokkos
has actually been compiled into the Trilinos libraries.
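One way to narrow this down, independent of Trilinos, might be to check that OpenMP is really active in the test program's own build. The following is only a minimal sketch: it assumes nothing beyond the standard OpenMP runtime API (omp.h), the helper name count_openmp_threads is hypothetical, and it does not touch Kokkos or Tpetra, so it cannot tell whether the Trilinos libraries themselves were built with OpenMP support.

```cpp
#ifdef _OPENMP
#include <omp.h>   // standard OpenMP runtime API, only present with -fopenmp
#endif

// Hypothetical helper (not part of Trilinos or Kokkos): returns the number
// of threads that actually execute an OpenMP parallel region, or 1 when the
// translation unit was compiled without -fopenmp.
int count_openmp_threads() {
    int count = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // One thread records the team size; the implicit barrier at the
        // end of "single" makes the write visible before the region ends.
        #pragma omp single
        count = omp_get_num_threads();
    }
#endif
    return count;
}
```

Printing the result of count_openmp_threads() at the start of main in test-tpetra_multiply.cpp would show whether the program itself can run more than one OpenMP thread. If it reports 1 even with -fopenmp, the OMP_NUM_THREADS environment variable may be limiting the thread count. If it reports more than one, the remaining question is whether the Tpetra and Kokkos libraries were configured to use OpenMP.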
Could anyone tell me exactly how I should enable multi-threaded
parallelization in Tpetra, or how I should debug this problem?
I'm using Trilinos 12.0.1.
Thank you in advance,