[Trilinos-Users] about Tpetra + Kokkos app built against installed trilinos
pierre.kestener at cea.fr
Wed Jan 4 06:13:02 EST 2017
Following up on my previous message, I now realize that setting the option
passed to cmake in a Trilinos configure script has no effect.
Whatever flags are passed there, they are never transmitted to nvcc_wrapper later.
Even if you correctly set CUDA_NVCC_FLAGS (correct CUDA architecture, ...),
if you build the Kokkos examples through Trilinos/cmake/tribits and then run on a GPU whose architecture differs from sm_35 (the nvcc_wrapper default), you will get a warning like this:
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 3.5 on device with compute capability 5.0 , this will likely reduce potential performance
Since the flags are not taken into account, nvcc_wrapper falls back to its default values, here architecture sm_35.
Could someone confirm whether there is a problem in the way cmake/tribits passes flags down to nvcc_wrapper?
As a suggestion, I would propose a slight modification to nvcc_wrapper:
- if the environment variable CUDA_NVCC_FLAGS is set, just parse it and use it to build the nvcc_command variable.
That way, I think the change needed to make sure the correct flags reach nvcc_wrapper would be minimal.
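The suggestion above could look roughly like the sketch below. It is not the actual nvcc_wrapper source: the function name build_nvcc_command and the variables default_arch, extra_flags and arch_flag are hypothetical, and it assumes CUDA_NVCC_FLAGS is exported as a space- or semicolon-separated list (CMake lists use ';').

```shell
#!/bin/sh
# Hypothetical sketch of the proposed change; not the real nvcc_wrapper code.

# Default architecture, as reported in the warning above.
default_arch="sm_35"

# Build the nvcc_command string from CUDA_NVCC_FLAGS (if set) plus the
# arguments given to the function.
build_nvcc_command() {
  # Assumption: flags may be ';'-separated (CMake list) or space-separated.
  extra_flags=$(printf '%s' "${CUDA_NVCC_FLAGS:-}" | tr ';' ' ')

  # Only add the default architecture if the user did not choose one.
  arch_flag="-arch=${default_arch}"
  case " ${extra_flags} " in
    *" -arch"*|*" --gpu-architecture"*|*" -gencode"*) arch_flag="" ;;
  esac

  nvcc_command="nvcc"
  # Unquoted on purpose: split the flag strings into words
  # (assumes the flags contain no glob characters).
  for word in ${arch_flag} ${extra_flags} "$@"; do
    nvcc_command="${nvcc_command} ${word}"
  done
}
```

With something like this in place, exporting CUDA_NVCC_FLAGS="-arch=sm_50" before the build would override the sm_35 default, while an unset variable would leave the current behaviour unchanged.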
From: KESTENER Pierre
Sent: Tuesday, January 3, 2017 23:41
To: trilinos-users at trilinos.org
Cc: KESTENER Pierre
Subject: about Tpetra + Kokkos app built against installed trilinos
The following is a suggestion for a minor improvement in the installed TrilinosConfig.cmake.
On the Trilinos tutorial wiki,
in the section "Learn how to create and use Kokkos with Tpetra."
the suggested CMakeLists.txt files are mostly OK. However, if Trilinos has been built with Tpetra+Kokkos and CUDA enabled, i.e. the variable CUDA_NVCC_FLAGS was passed to the main Trilinos configure script to activate some nvcc flags
passed to nvcc_wrapper (CUDA architecture, UVM, lambda, ...), the precise flags used to build Trilinos are not saved
in the installed TrilinosConfig.cmake.
I think it would be useful to have those precise flags saved in a variable named something like
This would make it easier afterwards to build a CUDA app against an installed Trilinos.