[Trilinos-Users] Sacado reverse mode

Phipps, Eric T etphipp at sandia.gov
Tue Mar 3 14:41:07 MST 2009


Hi Nik,

You are exactly correct.  With Sacado/Rad, you must generate a new buffer each time you want to evaluate a gradient at a new point, so you don't need to worry about branch switching.  You can also print out the values of intermediate variables for debugging, just as you described.  Note, however, that Rad still creates a memory buffer, so debugging the reverse sweep can be difficult (although if we have done our job right you won't need to do that).  The "taping" process in Rad is dramatically more efficient than ADOL-C's, and my experience has been that generating a new Rad buffer in the forward evaluation and then doing a reverse sweep costs roughly the same as re-using an existing ADOL-C tape for a forward and then a reverse sweep (the actual cost depends a lot on the details of the function you are differentiating).
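
To make the "new buffer at every point" idea concrete, here is a minimal sketch of a toy reverse-mode tape that stores local partials during the forward evaluation and is re-recorded for each new input.  It is not Sacado's implementation or API: the names Tape, Node, Var, independent, model and gradient_at are all hypothetical, and the branching function is an invented stand-in for a device model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy tape entry (illustration only, not Rad's data structure): indices of up
// to two arguments and the local partial derivatives with respect to them.
struct Node {
    std::size_t arg[2];
    double partial[2];
};

struct Tape {
    std::vector<Node> nodes;
    std::vector<double> adjoint;

    // Forget the recorded nodes but keep the allocated capacity for re-use.
    void clear() { nodes.clear(); }

    std::size_t push(std::size_t a0, double p0, std::size_t a1, double p1) {
        nodes.push_back(Node{{a0, a1}, {p0, p1}});
        return nodes.size() - 1;
    }

    // Reverse sweep: seed the output adjoint with 1 and accumulate backwards.
    void reverse(std::size_t output) {
        assert(output < nodes.size());
        adjoint.assign(nodes.size(), 0.0);
        adjoint[output] = 1.0;
        for (std::size_t i = output + 1; i-- > 0;) {
            const Node& n = nodes[i];
            adjoint[n.arg[0]] += n.partial[0] * adjoint[i];
            adjoint[n.arg[1]] += n.partial[1] * adjoint[i];
        }
    }
};

struct Var {
    Tape* tape;
    std::size_t index;
    double value;
    double val() const { return value; }  // can be printed on every call
};

// Independent variables point at themselves with zero partials.
Var independent(Tape& t, double v) {
    std::size_t i = t.nodes.size();
    t.push(i, 0.0, i, 0.0);
    return Var{&t, i, v};
}

Var operator*(const Var& a, const Var& b) {
    return Var{a.tape, a.tape->push(a.index, b.value, b.index, a.value),
               a.value * b.value};
}

Var operator+(const Var& a, const Var& b) {
    return Var{a.tape, a.tape->push(a.index, 1.0, b.index, 1.0),
               a.value + b.value};
}

Var sin(const Var& a) {
    return Var{a.tape, a.tape->push(a.index, std::cos(a.value), a.index, 0.0),
               std::sin(a.value)};
}

// Hypothetical model with a data-dependent branch, written as an ordinary if.
Var model(const Var& x) {
    if (x.val() < 1.0) return x * x;
    return sin(x) + x;
}

// Record a fresh tape at x0, run the reverse sweep, and return df/dx.
double gradient_at(Tape& tape, double x0) {
    tape.clear();
    Var x = independent(tape, x0);
    Var y = model(x);
    tape.reverse(y.index);
    return tape.adjoint[x.index];
}
```

Calling gradient_at on the same Tape at two points on opposite sides of the branch simply records whichever path the ordinary if statement takes at that point, so no special conditional construct and no separate retape check are needed in this sketch.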

I agree with you wholeheartedly that the branch switching facilities in ADOL-C are extremely cumbersome and are only useful where there are very few switches.  That is partly why Dave wrote Rad and designed it the way he did.  I will caution you, however, that ADOL-C is a very mature package, whereas Sacado isn't as mature.  I am happy to help with any difficulties you encounter using it, though.

One difficulty you may encounter with Rad in particular has to do with passive variables, which are variables that are declared with an AD type (ADvar<double> in Rad's case) but are initialized only once during your computation.  Part of what makes the taping process in Rad so efficient is that it recycles its memory from one taping to the next.  That means if you have a passive variable that was initialized in the first taping but not in subsequent tapings, its value will be corrupted when its memory is recycled.  If you have a variable like this, you need to tell Rad it is a constant using the AD_Const() function, for example,

ADvar<double> a = 1.0;
AD_Const(a);

The README_RAD file describes these kinds of problems and solutions for them in more detail.
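
As a rough illustration of why recycled memory corrupts a passive variable, the following sketch uses a toy arena that hands out slots sequentially and starts over from the first slot on each new "taping".  This is an assumption made purely for illustration: it is not Rad's allocator, and it does not show what AD_Const() does internally.  Arena, TapedValue, ConstantValue and the helper functions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy arena (not Rad's memory manager): slots are handed out in order and the
// whole block is reused from the start after reset().
struct Arena {
    std::vector<double> slots;
    std::size_t next = 0;

    explicit Arena(std::size_t n) : slots(n, 0.0) {}

    double* alloc(double v) {
        assert(next < slots.size());
        slots[next] = v;
        return &slots[next++];
    }

    void reset() { next = 0; }  // recycle: earlier slots get overwritten later
};

// A value that lives in recycled arena memory.
struct TapedValue {
    double* p;
    double val() const { return *p; }
};

// A value stored outside the arena, so recycling cannot touch it.
struct ConstantValue {
    double v;
    double val() const { return v; }
};

// One "taping": allocates the input and one product in the arena.
double evaluate(Arena& arena, double x, double coefficient) {
    double* xin = arena.alloc(x);
    double* y = arena.alloc(*xin * coefficient);
    return *y;
}

// Passive value initialized once in arena memory, then two tapings.
double passive_after_recycle() {
    Arena arena(8);
    TapedValue a{arena.alloc(1.0)};  // occupies the first slot
    evaluate(arena, 3.0, a.val());
    arena.reset();                   // start of the second taping
    evaluate(arena, 5.0, a.val());   // the input x now reuses the first slot
    return a.val();                  // no longer the value a was given
}

// Same sequence, but the constant is kept out of the recycled memory.
double constant_after_recycle() {
    Arena arena(8);
    ConstantValue c{1.0};
    evaluate(arena, 3.0, c.val());
    arena.reset();
    evaluate(arena, 5.0, c.val());
    return c.val();
}
```

In this toy, the value returned by passive_after_recycle() is whatever the second taping wrote into the reused slot, while constant_after_recycle() still returns the value the constant was given.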

-Eric

On 3/3/09 9:27 AM, "Nikhil Kriplani" <nkriplani at gmail.com> wrote:

Hi Eric,

Thanks for the reply. In our usage of ADOL-C, we have found that some
conditionals are either too difficult or cumbersome to convert to a
form such that every branch is taped. We are working on an electronic
system simulator and some semiconductor device models have pretty
convoluted if/else if/else constructs. Technically, it is possible to
reformulate this, but is often far from trivial. As a result, ADOL-C
will recreate the tape ... from our end we have to retape, i.e.
monitor whether a tape recreation is necessary (ADOL-C provides a
function for this) and then redo all the automatic differentiation
steps.

The main trouble we have had with the tape system is that it is
impossible to debug code. In a simple case assuming there is no
retaping, if a piece of code is called several times and if I have a
std::cout statement in there to monitor the state of an overloaded
double variable, I can only see its value at step zero because the
tape "takes over" after that and I can't access the tape.

I was wondering if that would be an issue with the reverse mode in
Sacado ... if I understand your previous response correctly, Sacado
evaluates the functional code, in my case the semiconductor device
model code, every time the code is called (with the intermediate
partials stored efficiently). So I do not have to worry about what
will happen when branch switching occurs and I could debug easily by
monitoring the value of some variable (say, var1) by having std::cout
<< var1.val() in my code to print out the value every time the code is
called. Does this sound correct?

If this is so, then I also would not need a separate retape function ...

--Nik

On Mon, Mar 2, 2009 at 11:54 AM, Phipps, Eric T <etphipp at sandia.gov> wrote:
> Hi Nik,
>
> The reverse mode in Sacado (called RAD) does use a temporary buffer that is
> somewhat similar to ADOL-C, but not exactly the same.  In the forward
> evaluation, RAD creates a temporary buffer that stores the value and partial
> derivatives of each intermediate operation that are accumulated during the
> reverse sweep.  ADOL-C on the other hand stores values and a functional
> representation of each operation.  The operation partials are then only
> implicitly generated and used during the reverse accumulation.  Since RAD
> stores the partials up-front, there is less interpretation of the temporary
> buffer/tape as compared to ADOL-C leading to a generally more efficient
> derivative computation.  Note however that RAD doesn't provide any
> capability for re-evaluating this temporary buffer at a new set of
> independent values (this isn't possible since RAD doesn't store functional
> information), nor any branch switching capabilities.  Instead you should
> re-create the buffer when evaluating at a new point, and RAD makes use of a
> custom memory management scheme to make this efficient.
>
> As far as I understand, ADOL-C can't in fact recreate a tape for you, rather
> if there are branches in the code, you replace the conditionals with a
> special ADOL-C function that represents the branch in the tape.  Then the
> appropriate branch will be evaluated from the tape depending on the value of
> the conditional (essentially all possible paths through the code are taped).
>  This only works if all of your branches through the code can be evaluated
> together and have no conflicts.
>
> Unfortunately the documentation for Sacado is lacking as you pointed out.
>  That is something we hope to rectify in the future.  The README_RAD file in
> sacado/src describes some usage tips for RAD.  More information about RAD is
> also provided in a paper available on Dave Gay's website:
> http://www.cs.sandia.gov/~dmgay/ad04_paper.pdf.
>
> I hope this helps.  Please let us know if you have any further questions.
>
> -Eric
>
>
> On 2/27/09 4:35 PM, "Nikhil Kriplani" <nkriplani at gmail.com> wrote:
>
> Hi,
>
> About the reverse mode in Sacado: Does it use a temporary buffer or
> tape like some AD packages (like ADOL-C) do when operating in reverse
> mode or does it perform code evaluations every time the function is
> called, for example in a loop? I am somewhat familiar with ADOL-C and
> I know that when there is a branch switch detected in ADOL-C, the tape
> is recreated.
>
> I couldn't find any documentation on Sacado apart from looking at code
> examples. Is there a place I can get some documentation and usage
> tips?
>
> Thanks,
> Nik
>
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at software.sandia.gov
> http://software.sandia.gov/mailman/listinfo/trilinos-users
>
>

