[Trilinos-Users] Deterministic messaging?
Heroux, Michael A
maherou at sandia.gov
Wed Jun 15 09:12:56 MDT 2011
In my experience, MPI gives bit-wise identical collective results when run
on the same machine with the same number of processes. The standard does
not guarantee this, but it is a practical reality, and I don't recall
seeing any exception. Although many algorithms are used for collectives,
some of them do in fact guarantee bitwise reproducibility, e.g., the
Gray-coding algorithm.
Mike
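
To make the point about ordering concrete, here is a minimal sketch in plain Python (not MPI, and not how any particular MPI implementation performs its reductions). The helper names `bit_pattern`, `reduce_left`, and `reduce_right` and the `partials` values are hypothetical, standing in for per-process partial sums. It illustrates that repeating a reduction in the same order reproduces the same bits, while a different combining order may not.

```python
import struct
from functools import reduce


def bit_pattern(x: float) -> str:
    """Return the IEEE 754 double bit pattern of x as hex, for exact comparison."""
    return struct.pack(">d", x).hex()


def reduce_left(partials):
    """Combine partial sums as ((p0 + p1) + p2) + ..."""
    return reduce(lambda acc, p: acc + p, partials)


def reduce_right(partials):
    """Combine partial sums as p0 + (p1 + (p2 + ...))."""
    return reduce(lambda acc, p: p + acc, reversed(partials))


# Hypothetical per-process partial sums (illustrative values only).
partials = [0.1, 0.2, 0.3]

first_run = reduce_left(partials)
second_run = reduce_left(partials)
other_order = reduce_right(partials)

# Same order, same inputs: identical bits.
print(bit_pattern(first_run) == bit_pattern(second_run))
# Different combining order: the bits differ for these inputs.
print(bit_pattern(first_run) == bit_pattern(other_order))
```

A collective that always combines contributions in the same order for a given process count behaves like the repeated `reduce_left` call; changing the process count or the reduction algorithm is analogous to switching to `reduce_right`.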
On 6/15/11 10:01 AM, "Conjeepuram Subramanian, Natarajan"
<C.S.Natarajan at bp.com> wrote:
> Mike,
> Maybe I don't understand this issue correctly, but wouldn't
> guaranteeing consistency mean floating-point arithmetic is associative?
> I wasn't aware one could guarantee that with MPI-2!
>
> Cheers,
> C.S.N
>
> -----Original Message-----
> From: trilinos-users-bounces at software.sandia.gov
> [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of Heroux,
> Michael A
> Sent: Wednesday, June 15, 2011 9:50 AM
> To: Bartlett, Roscoe A; Willenbring, James M; John R Cary; Trilinos
> Users
> Subject: Re: [Trilinos-Users] Deterministic messaging?
>
> John,
>
> As long as you are running on the same number of MPI processes on the
> same machine, without threading (for example in Epetra), you should see
> bit-wise identical results from Trilinos.
>
> Mike
>
>
> On 6/15/11 9:30 AM, "Roscoe Bartlett" <rabartl at sandia.gov> wrote:
>
>> John,
>>
>> I thought that bit-wise reproducibility with single thread-per-process MPI
>> was guaranteed on a homogeneous machine (and is given by the MPI
>> implementation of global reduction operations). On a heterogeneous machine
>> I am not sure this is true, but you will have to talk to the MPI
>> implementation people about this, not Trilinos developers. As long as MPI
>> race conditions don't exist, I don't think this is a Trilinos problem.
>>
>> Moving to multi-core (multiple threads, etc.) changes all of this ...
>>
>> -Ross
>>
>>
>>> -----Original Message-----
>>> From: trilinos-users-bounces at software.sandia.gov [mailto:trilinos-
>>> users-bounces at software.sandia.gov] On Behalf Of Willenbring, James M
>>> Sent: Wednesday, June 15, 2011 7:42 AM
>>> To: John R Cary; Trilinos Users
>>> Subject: Re: [Trilinos-Users] Deterministic messaging?
>>>
>>> John,
>>>
>>> We have discussed this a few times at developer meetings. The
>>> consensus has been that bit-level reproducibility is prohibitively
>>> expensive. Are you able to modify the test to have some tolerance?
>>> This seems to be the most common way to avoid needing bit-level
>>> reproducibility.
>>>
>>> Jim
>>>
>>> -----Original Message-----
>>> From: trilinos-users-bounces at software.sandia.gov [mailto:trilinos-
>>> users-bounces at software.sandia.gov] On Behalf Of John R Cary
>>> Sent: Wednesday, June 15, 2011 6:33 AM
>>> To: Trilinos Users
>>> Subject: [Trilinos-Users] Deterministic messaging?
>>>
>>> We have a regression test that uses a trilinos solver.
>>> It seems to drift a bit (numerical errors) when run in
>>> parallel but not serial.
>>>
>>> Is the messaging when using trilinos solvers deterministic
>>> so that one can have bit-level reproducibility?
>>>
>>> Can it be made so?
>>>
>>> Thx....John
>>>
>>> _______________________________________________
>>> Trilinos-Users mailing list
>>> Trilinos-Users at software.sandia.gov
>>> http://software.sandia.gov/mailman/listinfo/trilinos-users
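
The tolerance-based regression test suggested in the thread could look roughly like the sketch below. It is plain Python and independent of Trilinos. The function name `matches_baseline`, the tolerance values, and the sample data are all assumptions chosen for illustration; appropriate tolerances depend on the solver and the problem.

```python
import math


def matches_baseline(computed, baseline, rel_tol=1e-10, abs_tol=1e-14):
    """Compare two result vectors entry by entry within a tolerance.

    rel_tol and abs_tol are illustrative defaults, not recommended values.
    """
    if len(computed) != len(baseline):
        return False
    return all(
        math.isclose(c, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, b in zip(computed, baseline)
    )


# Hypothetical stored baseline and a parallel run whose first entry was
# accumulated in a different order, so it differs in the last bits.
baseline = [0.6, 1.0, -2.5]
parallel_run = [0.1 + 0.2 + 0.3, 1.0, -2.5]

print(parallel_run == baseline)                   # exact comparison
print(matches_baseline(parallel_run, baseline))   # tolerance comparison
```

An exact comparison rejects the parallel run here even though it differs only by rounding, while the tolerance comparison accepts it and would still reject a genuinely wrong value.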