[Trilinos-Users] [EXTERNAL] GPU Support for Sparse Direct Solvers KLU2 / Basker

Weber, Daniel daniel.weber at igd.fraunhofer.de
Tue Jun 11 03:53:42 EDT 2019

Dear Siva,

thank you for your prompt and detailed response.

I’m investigating and comparing GPU-accelerated sparse direct linear solvers. I know that GPU acceleration for this class of solvers is tough. Currently, I’m checking which solvers are available, and I was wondering whether there is one in the Trilinos package. I was looking for something similar to the GPU-accelerated solvers of SuiteSparse, although they don’t have a sparse LU solver, only optimized QR and Cholesky solvers. From what I understand, Tacho is not available yet, and it is still an open question whether it will provide at least a decent speedup.

I will check the future releases of Trilinos, but of course I would be happy if your colleagues could also share some details.

Best regards,

From: Rajamanickam, Sivasankaran <srajama at sandia.gov>
Sent: Saturday, June 8, 2019 04:17
To: Weber, Daniel <daniel.weber at igd.fraunhofer.de>; trilinos-users at trilinos.org
Cc: Hollman, David S <dshollm at sandia.gov>; Kim, Kyungjoo (-EXP) <kyukim at sandia.gov>
Subject: Re: [EXTERNAL] [Trilinos-Users] GPU Support for Sparse Direct Solvers KLU2 / Basker


  a) Being Kokkos-based does not automatically translate to running on GPUs. We have not tried Basker on GPUs and don't expect it to do well on GPUs as it is written now. KLU2 doesn't use Kokkos. It is just a templated version of KLU.

  b) Tacho, a sparse Cholesky solver, is based on Kokkos tasking and can be run on GPUs. However, tasking on GPUs is a hard problem, and the performance in Trilinos master is not that good. I have copied Kyungjoo Kim and David Hollman, who are working on improving Kokkos tasking on GPUs and Tacho performance on GPUs. This is coming to Trilinos very soon (with the next Kokkos update). They can add more details.

  That said, sparse direct solvers on GPUs are quite hard. What is your primary use case for requiring this?



From: Trilinos-Users <trilinos-users-bounces at trilinos.org> on behalf of Weber, Daniel <daniel.weber at igd.fraunhofer.de>
Sent: Friday, June 7, 2019 8:18 AM
To: trilinos-users at trilinos.org
Subject: [EXTERNAL] [Trilinos-Users] GPU Support for Sparse Direct Solvers KLU2 / Basker


I’m currently trying to identify whether there are sparse direct solvers available within the Trilinos project. What I think I have understood from studying the documentation, tutorials, etc.:

- There are abstractions for general sparse linear solvers (Stratimikos) and respective specializations for iterative (Belos) and direct solvers (Amesos2).

- Within Amesos2, external solvers can be used (e.g. SuperLU) or one of the two internal solvers, KLU2 and Basker.

- KLU2 and Basker rely on Kokkos, which abstracts from the programming model (OpenMP, CUDA, etc.).

From this information I conclude that, theoretically, KLU2 or Basker might be configured / compiled with GPU acceleration (due to the Kokkos abstraction). However, I haven’t found any indication of a) whether this statement is true, b) whether it makes sense (e.g. the memory access pattern might result in poor performance), or c) how to set it up.

I would really appreciate any kind of information, i.e. simple yes / no answers to a) and b), detailed answers, or pointers to slides, documentation, videos, etc. for a)–c).

Thank you, best regards,