[Trilinos-Users] [EXTERNAL] Threading for incomplete LU in Ifpack2

Wen Yan wenyan4work at gmail.com
Thu Dec 22 13:20:21 EST 2016


After a lot of trial and error, I found that damped point Jacobi relaxation
works well for me as an MPI preconditioner and also multithreads well.
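
For readers unfamiliar with the method, here is a minimal sketch of damped
point Jacobi relaxation in plain Python. It does not use Trilinos or Ifpack2;
the function name, the dense list-of-lists matrix, the damping factor of 2/3
and the sweep count are all assumptions chosen for illustration. It shows why
the method threads well: each row update depends only on the previous iterate.

```python
def damped_jacobi(A, b, omega=2.0 / 3.0, sweeps=50, x0=None):
    """Run `sweeps` damped point Jacobi sweeps on A x = b.

    A is a dense list of rows purely for illustration; a real code would
    use a distributed sparse matrix. omega is the damping factor.
    """
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    for _ in range(sweeps):
        # The residual uses only the old iterate, so every row is
        # independent of the others and could be handled by its own thread.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + omega * inv_diag[i] * r[i] for i in range(n)]
    return x


# Small diagonally dominant test system (hypothetical data).
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = damped_jacobi(A, b)
```

Used as a preconditioner, one would typically apply only a few sweeps per
outer Krylov iteration rather than iterating to convergence as done here.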

Thank you!

On Sat, Dec 17, 2016 at 11:04 AM, Rajamanickam, Sivasankaran <
srajama at sandia.gov> wrote:

> CC'ing the users as others can respond in the group as well.
>
>
> The recommended pattern for solvers/preconditioners is:
>
> initialize() - call when the graph changes
>
> compute() - call when the matrix values change
>
> apply() - call when you apply the preconditioner
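
The three-phase pattern above can be sketched with a toy preconditioner in
plain Python. This is not the Ifpack2 API: the class ToyDiagonalPreconditioner,
the helper refresh(), and the dict-of-dicts matrix layout are hypothetical
names invented for this illustration. The sketch only shows the bookkeeping:
initialize() is repeated when the sparsity graph changes, compute() when the
values change, and apply() on every use.

```python
class ToyDiagonalPreconditioner:
    """Hypothetical class mimicking the initialize/compute/apply split."""

    def __init__(self):
        self.pattern = None       # graph recorded by initialize()
        self.inv_diag = None      # numeric data built by compute()
        self.num_initialize = 0
        self.num_compute = 0

    @staticmethod
    def pattern_of(A):
        # A maps row index -> {column index: value}.
        return frozenset((i, j) for i, row in A.items() for j in row)

    def initialize(self, A):
        # Symbolic phase: depends only on the graph of A.
        self.pattern = self.pattern_of(A)
        self.inv_diag = None
        self.num_initialize += 1

    def compute(self, A):
        # Numeric phase: depends on the values of A, reuses the graph.
        if self.pattern != self.pattern_of(A):
            raise ValueError("graph changed: call initialize() first")
        self.inv_diag = {i: 1.0 / A[i][i] for i in A}
        self.num_compute += 1

    def apply(self, x):
        if self.inv_diag is None:
            raise RuntimeError("call compute() before apply()")
        return {i: self.inv_diag[i] * x[i] for i in x}


def refresh(prec, A):
    """Re-run only the phases that the change in A requires."""
    if prec.pattern != prec.pattern_of(A):
        prec.initialize(A)
    prec.compute(A)


prec = ToyDiagonalPreconditioner()
A1 = {0: {0: 2.0, 1: -1.0}, 1: {0: -1.0, 1: 2.0}}
A2 = {0: {0: 4.0, 1: -1.0}, 1: {0: -1.0, 1: 4.0}}  # same graph, new values
A3 = {0: {0: 4.0}, 1: {1: 2.0}}                    # graph changed
for A in (A1, A2, A3):
    refresh(prec, A)
y = prec.apply({0: 8.0, 1: 10.0})
```

With adaptive refinement at every timestep, as described later in this
thread, the graph changes each time, so a driver like refresh() would end up
calling initialize() on every step.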
>
>
> In your case, the cost of initialize() becomes important. The SGS use case
> handles this. The coloring in initialize() is also multithreaded and
> GPU-capable. However, this is not a very common use case.
>
>
> I don't understand your last question.
>
>
> -Siva
>
>
> ------------------------------
> *From:* Wen Yan <wenyan4work at gmail.com>
> *Sent:* Tuesday, December 13, 2016 12:45 PM
> *To:* Rajamanickam, Sivasankaran
> *Subject:* Re: [EXTERNAL] [Trilinos-Users] Threading for incomplete LU in
> Ifpack2
>
> Currently I'm using a naive implementation. Every timestep generates a new
> preconditioner, with one initialize() and one compute() call. Because the
> mesh is adaptively refined at every timestep, the matrix dimension and
> sparsity pattern both change, so I am not sure how to reuse the
> initialized preconditioner from the previous timestep.
>
> Is it necessary to use a geometric multigrid to keep the matrix structure
> constant to reuse the initialized matrix?
>
> Thanks,
> Wen
>
>
>
>
> On Tue, Dec 13, 2016 at 2:27 PM, Rajamanickam, Sivasankaran <
> srajama at sandia.gov> wrote:
>
>> Yes, SGS is symmetric Gauss-Seidel.
>>
>>
>> Making sure multithreading works with third-party libraries is really
>> hard. Note that some solvers could use OpenMP and some could use Pthreads.
>> First, the threading model needs to match how you configure Trilinos.
>>
>>
>> Second, we (and a lot of other solvers) assume the cost of initialize()
>> can be amortized over multiple compute() calls. How many initialize() and
>> compute() calls are you doing?
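
The amortization argument above can be made concrete with a small
back-of-the-envelope calculation. The function name and the timings below are
hypothetical, not measurements from this thread. The sketch just spreads one
initialize() cost over the compute() calls that share it.

```python
def average_cost_per_solve(t_initialize, t_compute, computes_per_initialize):
    """Average setup cost per compute() when one initialize() is shared.

    All arguments are assumptions supplied by the caller; times are in
    seconds.
    """
    if computes_per_initialize < 1:
        raise ValueError("need at least one compute() per initialize()")
    return t_initialize / computes_per_initialize + t_compute


# Hypothetical timings: initialize() takes 8 s, compute() takes 2 s.
every_step = average_cost_per_solve(8.0, 2.0, 1)    # re-initialize each time
reused = average_cost_per_solve(8.0, 2.0, 16)       # graph reused 16 times
```

When initialize() runs once per compute(), as with a graph that changes every
timestep, nothing is amortized and the unthreaded phase dominates the average.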
>>
>>
>> -Siva
>>
>>
>>
>> ------------------------------
>> *From:* Wen Yan <wenyan4work at gmail.com>
>> *Sent:* Tuesday, December 13, 2016 11:42 AM
>> *To:* Rajamanickam, Sivasankaran
>> *Subject:* Re: [EXTERNAL] [Trilinos-Users] Threading for incomplete LU
>> in Ifpack2
>>
>> Thanks for the explanation, Siva.
>>
>> I am solving a many-body boundary element problem in a matrix-free
>> formulation with FMM and MPI. Does SGS mean symmetric Gauss-Seidel?
>>
>> Also, I successfully got the multithreaded SuperLU_DIST working in
>> Ifpack2 through the Amesos2 interface. The compute() stage is threaded but
>> the initialize() stage is not, and it takes a very long time to initialize.
>> Therefore the threading in the compute() phase is no longer useful. Is this
>> expected behavior?
>>
>> Thanks,
>> Wen
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Dec 13, 2016 at 1:21 PM, Rajamanickam, Sivasankaran <
>> srajama at sandia.gov> wrote:
>>
>>> Wen Yan,
>>>
>>>    The ILU(k) multithreading is being planned for later this year.
>>> ILU(t) multithreading is trickier.
>>>
>>>
>>>    If you want to use a multithreaded local solve right away, you
>>> could try SGS. There is an experimental capability available now. What
>>> is the problem you are trying to solve?
>>>
>>>
>>> -Siva
>>>
>>>
>>> ------------------------------
>>> *From:* Trilinos-Users <trilinos-users-bounces at trilinos.org> on behalf
>>> of Wen Yan <wenyan4work at gmail.com>
>>> *Sent:* Tuesday, December 13, 2016 9:32 AM
>>> *To:* trilinos-users at trilinos.org
>>> *Subject:* [EXTERNAL] [Trilinos-Users] Threading for incomplete LU in
>>> Ifpack2
>>>
>>> Hi Trilinos Users,
>>>
>>> I was wondering if the RILUK and ILUT methods in Ifpack2 support
>>> multithreading. The online Doxygen webpage for Ifpack2 says: "Finally,
>>> Ifpack2's algorithms use and produce Tpetra objects, so you can exploit
>>> Tpetra's hybrid (MPI + threads) parallelism features without effort."
>>> ("Ifpack2" in that sentence links to
>>> <https://trilinos.org/docs/dev/packages/ifpack2/doc/html/namespaceIfpack2.html>.)
>>>
>>> However, I see that RILUK and ILUT both use only one CPU core in the
>>> compute() phase of my test program, with a sparse matrix dimension of
>>> around 10^4 to 10^5. About 10% of the entries in each row are nonzero. I
>>> have compiled Trilinos with OpenMP support and enabled OpenMP by setting
>>> OMP_NUM_THREADS. I confirmed that multithreading works in other packages
>>> through the test programs run by 'make test'.
>>>
>>> Is it possible to activate multithreading for RILUK and ILUT in
>>> Ifpack2? Or is it possible to call SuperLU_MT or SuperLU_DIST through
>>> the Amesos2 interface?
>>>
>>> Thank you,
>>> Wen Yan
>>>
>>
>>
>