[Trilinos-Users] Floating-point arithmetic is non-associative

Conjeepuram Subramanian, Natarajan C.S.Natarajan at bp.com
Wed Jun 15 10:53:00 MDT 2011


Mike, Mark,
       Thanks for clarifying. 
Mike, thanks for pointing out that you have always seen bit-wise
identical collective results. I have seen slightly varying results on
the same number of procs; nothing alarming, but I think I'll go back
and revisit them.

Mark, I understand your logic about FP arithmetic not being associative
and that being the reason for different answers but I am still a little
unsure how you can guarantee it to be commutative?

I am going to be an ignoramus and ask a naive question.
Say we do a reduce over 3 procs with the predefined product op and
values a (p0), b (p1), c (p2), where a and c are very large and b is
very small (all floats, with a*c > max(float) and a*b, b*c <<
max(float)). I would think this gives the correct result only if the
product a*c is guaranteed not to be computed before a*b or b*c, i.e.
only for certain placements of a, b, and c on procs 0, 1, and 2 (or
some good combination thereof)!

Is this not correct? As far as I know, all predefined operations are
assumed to be both associative and commutative, and you can turn off
commutativity by creating your own op, but not associativity. So would
the previous example be considered a case of non-associative behaviour?
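
A minimal sketch of this scenario, assuming Python doubles in place of
MPI floats and made-up values a = c = 2^600, b = 2^-600 (so a*c
overflows while a*b and b*c do not). It only writes out the possible
groupings of a 3-value product by hand; it says nothing about the order
any particular MPI implementation actually uses:

```python
import math

# Hypothetical values: a and c are huge, b is tiny (all exact powers of two).
a = 2.0 ** 600
b = 2.0 ** -600
c = 2.0 ** 600

# Grouping 1: combine a and b first, then c. Nothing overflows.
a_b_first = (a * b) * c

# Grouping 2: combine a and c first, then b. a*c overflows to inf,
# and multiplying inf by a tiny number cannot bring it back.
a_c_first = (a * c) * b

# Swapping the operands of a single product does not change it.
commutes = (a * c == c * a) and (a * b == b * a)

print(a_b_first == c)         # same value as c, since a*b is exactly 1.0
print(math.isinf(a_c_first))  # the overflowed grouping
print(commutes)
```

Each pairwise product is the same whichever operand comes first; only
the grouping changes the answer, which is what makes this an
associativity issue rather than a commutativity one.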

Cheers,
C.S.N

-----Original Message-----
From: trilinos-users-bounces at software.sandia.gov
[mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of
Hoemmen, Mark
Sent: Wednesday, June 15, 2011 10:55 AM
To: trilinos-users at software.sandia.gov
Subject: [Trilinos-Users] Floating-point arithmetic is non-associative

On 6/15/11 10:01 AM, "Conjeepuram Subramanian, Natarajan"
<C.S.Natarajan at bp.com> wrote:
> Maybe I don't understand this issue correctly but wouldn't
> guaranteeing consistency mean floating point arithmetic is
> commutative?

In the previous message string, please replace "commutative" with
"associative."  (Non-associativity of floating-point arithmetic is the
cause of different results when varying the number of processors.)
Floating-point addition and multiplication are commutative: a + b = b +
a, and a*b = b*a.  
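
A small illustration of the distinction, assuming IEEE 754 doubles as
used by Python's float (the values 0.1, 0.2, 0.3 are just a convenient
example, not taken from any Trilinos run):

```python
x, y, z = 0.1, 0.2, 0.3

# Commutative: swapping the operands of one operation gives the same bits.
add_commutes = (x + y == y + x)
mul_commutes = (x * y == y * x)

# Not associative: moving the parentheses changes the rounding.
left_grouped = (x + y) + z
right_grouped = x + (y + z)

print(add_commutes, mul_commutes)
print(left_grouped == right_grouped)
print(repr(left_grouped), repr(right_grouped))
```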

Associativity relates to reduction trees: a reduction tree is a complete
parenthesization of a sum.  Changing the shape of the reduction tree
changes the placement of parentheses in the sum.  Other than completely
serializing all reductions, the only way to guarantee bitwise repeatable
results is to use so-called "distillation" algorithms.  I have not seen
any parallel implementations of those.  They would be quite slow in any
case.  A faster way to make bitwise identical results more likely (not
guaranteed!) is to compute all sums in extended-precision arithmetic.  
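
As a rough sketch of the reduction-tree point, the following simulates
a sum split across a varying number of "processes" in plain Python.
The helper names chained_sum and chunked_sum are hypothetical, the
input values are contrived to make the effect obvious, and math.fsum
(a correctly rounded sum from the standard library) is used only as a
stand-in for an order-independent reference, not as the distillation
algorithm referred to above:

```python
import math

def chained_sum(values):
    """Add values left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, nprocs):
    """Simulate a reduction: each of nprocs 'processes' sums one
    contiguous chunk, then the partial sums are added left to right."""
    n = len(values)
    bounds = [n * i // nprocs for i in range(nprocs + 1)]
    partials = [chained_sum(values[lo:hi])
                for lo, hi in zip(bounds, bounds[1:])]
    return chained_sum(partials)

big = 2.0 ** 60                  # so large that big + 1.0 rounds back to big
values = [big, 1.0, -big, 1.0]   # the exact sum is 2.0

serial = chunked_sum(values, 1)      # ((big + 1) - big) + 1
two_procs = chunked_sum(values, 2)   # (big + 1) + (-big + 1)
reference = math.fsum(values)        # correctly rounded, order-independent

print(serial, two_procs, reference)
```

The same four numbers give three different answers depending only on
where the parentheses fall, which is the effect of changing the number
of processors.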

mfh

