[Trilinos-Users] ThreadSafe parallel assembly
rrossi at cimne.upc.edu
Thu Apr 17 07:21:03 MDT 2014
First of all, thanks for your answer.
Indeed the code is minimally intrusive and it is worth trying out.
From my past experience, however, the scalability of such a solution is really
low. We tried it several years ago to do OpenMP assembly within uBLAS
(an otherwise serial library) and the results were a little disappointing.
We got much better results by protecting the whole row at once with an
OpenMP lock (in our code we store one such lock per local row of
the matrix). This way only one thread at a time can access a given row of
the matrix, but all of the writing within that row is then done without
further synchronization (in principle it is rare for many threads to write
to the same row at once... but if they do, they will contend heavily on that row).
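A minimal sketch of the per-row lock idea, for illustration only. It is not
the Kratos or uBLAS code: the RowLockedMatrix class, its dense row-major
storage, and the assemble_chain toy problem are all hypothetical names made
up for this example. Without OpenMP the locks compile to no-ops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical wrapper around an OpenMP lock (no-op in a serial build).
struct RowLock {
#ifdef _OPENMP
    omp_lock_t l;
    RowLock() { omp_init_lock(&l); }
    ~RowLock() { omp_destroy_lock(&l); }
    void acquire() { omp_set_lock(&l); }
    void release() { omp_unset_lock(&l); }
#else
    RowLock() {}
    void acquire() {}
    void release() {}
#endif
    RowLock(const RowLock&) = delete;
    RowLock& operator=(const RowLock&) = delete;
};

// Hypothetical dense row-major matrix with one lock per row.
class RowLockedMatrix {
public:
    RowLockedMatrix(std::size_t nrows, std::size_t ncols)
        : nrows_(nrows), ncols_(ncols), values_(nrows * ncols, 0.0), locks_(nrows) {}

    // Lock the row once, then write all of its entries without further synchronization.
    void sum_into_row(std::size_t row,
                      const std::vector<std::size_t>& cols,
                      const std::vector<double>& vals) {
        assert(row < nrows_ && cols.size() == vals.size());
        locks_[row].acquire();
        for (std::size_t k = 0; k < cols.size(); ++k)
            values_[row * ncols_ + cols[k]] += vals[k];
        locks_[row].release();
    }

    double operator()(std::size_t i, std::size_t j) const { return values_[i * ncols_ + j]; }

private:
    std::size_t nrows_, ncols_;
    std::vector<double> values_;
    std::vector<RowLock> locks_;
};

// Toy problem: element e couples nodes e and e+1 with the local matrix [[1,-1],[-1,1]].
void assemble_chain(RowLockedMatrix& A, int n_elements) {
    #pragma omp parallel for
    for (int e = 0; e < n_elements; ++e) {
        const std::size_t i = static_cast<std::size_t>(e);
        const std::vector<std::size_t> cols = {i, i + 1};
        A.sum_into_row(i,     cols, { 1.0, -1.0});
        A.sum_into_row(i + 1, cols, {-1.0,  1.0});
    }
}
```

Neighbouring elements share a node, so two threads can hit the same row; the
lock serializes only that row and leaves all other rows free.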
Admittedly this solution was designed when OpenMP was "younger" (gcc 4.2),
and I guess that atomic implementations have improved since then.
I may redo some of the benchmarks on a toy problem and report back if I
find anything interesting.
Greetings and happy Easter
On Thu, Apr 17, 2014 at 12:47 PM, Heroux, Michael A <maherou at sandia.gov> wrote:
> The code you are seeing was developed by a student working on the LifeV
> project. It works well in specific contexts. Below is his response to me,
> that I am forwarding to you.
> I hope it is helpful.
> The changes to the matrix classes in Epetra were minimal. I made them
> inside EPETRA_HAVE_OMP_NONASSOCIATIVE blocks, as a safety measure, but it's
> not really related to non-associative operations.
> Change number one was that for local row insertions in Epetra CrsMatrix I
> ensured thread safety with OpenMP atomic operations. If the client calls
> SumIntoGlobalValues inside an OpenMP parallel region, then the operation is
> fast and safe.
> Number two is that if overlapping maps are used for the matrix (or
> off-process indices are discarded in the constructor) then off-process
> contributions to the indices are completely ignored - previously, they were
> still collected but then discarded "at the last minute". If overlapping
> maps are not used, the operations are still made thread safe with an OpenMP
> critical region, but this is not very fast.
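For readers following the thread, the atomic approach described above can be
sketched roughly as follows. This is an assumption-laden illustration, not
the Epetra source: atomic_sum_into and contended_sum are hypothetical
helpers, and the row is reduced to a plain array of values with the column
positions already located.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT the Epetra implementation: adds vals[k] into the
// row's value array at position pos[k], using one atomic update per entry.
void atomic_sum_into(std::vector<double>& row_values,
                     const std::vector<std::size_t>& pos,
                     const std::vector<double>& vals) {
    assert(pos.size() == vals.size());
    double* data = row_values.data();
    for (std::size_t k = 0; k < pos.size(); ++k) {
        const double v = vals[k];
        // Only this single read-modify-write is protected; no lock is held.
        #pragma omp atomic
        data[pos[k]] += v;
    }
}

// Worst case for contention: every iteration updates the same entry.
double contended_sum(int n_updates) {
    std::vector<double> row(3, 0.0);
    const std::vector<std::size_t> pos = {1};
    const std::vector<double> vals = {1.0};
    #pragma omp parallel for
    for (int i = 0; i < n_updates; ++i) {
        atomic_sum_into(row, pos, vals);
    }
    return row[1];
}
```

Wrapping the whole loop in an OpenMP critical region instead would also be
correct, but it serializes updates to every row, which fits the remark that
the critical-region path is not very fast.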
>> *From:* trilinos-users-bounces at software.sandia.gov [
>> trilinos-users-bounces at software.sandia.gov] on behalf of Riccardo Rossi [
>> rrossi at cimne.upc.edu]
>> *Sent:* Sunday, April 13, 2014 12:22 PM
>> *To:* trilinos-users at software.sandia.gov
>> *Subject:* [EXTERNAL] [Trilinos-Users] ThreadSafe parallel assembly
>> Dear list,
>> I have been looking at the code in the Epetra CRS matrix and found
>> that if "EPETRA_HAVE_OMP_NONASSOCIATIVE" is defined,
>> the function "SumIntoGlobalValues" should be thread safe.
>> Is there any documentation of this? Any other interesting options?
>> Thanks in advance
PhD, Civil Engineer
member of the Kratos Team: www.cimne.com/kratos
lecturer at Universitat Politècnica de Catalunya, BarcelonaTech (UPC)
Research fellow at International Center for Numerical Methods in
C/ Gran Capità, s/n, Campus Nord UPC, Ed. C1, Despatx C9
08034 – Barcelona – Spain – www.cimne.com -
T.(+34) 93 401 56 96 skype: *rougered4*