[Trilinos-Users] XYCE math formulation article

Jon Engelbert jon at beigebag.com
Thu Jul 19 11:23:20 MDT 2007

Eric (and others),
First, let me express my admiration for your XYCE team... I've been working
with Spice for 15 years now, and I know first-hand how difficult it is.  To
be successful takes many skills, including software, math, circuits, and
device models, and a hell of a lot of persistence.
OK, enough fawning...

I suspected that our strategy for solving for "dx" was OK because we
implemented that as a new option for the spice engine in B2 Spice, and I
tested many circuits with both solving for X and solving for delta X.  The
simulations generally were within a very fine difference (e.g. .1%) with the
default tolerances, and some circuits even converged when solving for delta
X that didn't converge when solving for X.  So changing the equation around
was worth the effort.  If you want more details, I could send you some of
these circuits.  I assume that this is due to the numerical effects of
fixed/limited precision and the greater precision in the solution vector
when it is "Delta X" and converges to zero than when it is "X".   Again,
this is with the default sparse matrix functionality (I think it's sparse
1.3) in berkeley spice 3.

Also, I don't know if you have a good test suite of circuits.  If you do,
please let me know about them.  I used the NCSU suite of circuits from 1990,
CircuitSim90.  Also, we have many circuits that our customers have sent in
over the years, but I can't share those.  Have you tested xyce with
circuitsim90?  If so, then how many ran to completion and gave the correct

When testing with CircuitSim90, I was surprised to find that the default
berkeley spice simulator chokes on many of them... probably 30%.  I found
some mundane errors in the mos1 & mos2 models that were easy to fix and
helped some circuits to converge, and switching so that SPARSE solves for
"DX" instead of "X" helped with a few others, but there are still many (e.g.
25%) that fail.  I don't know how many of these failures are due to
discontinuities in the mos models, but assuming that the MOS models are not
to blame, then the 'problem' is in the numerical methods.  I don't remember
how many failed to converge on an operating point, but of those that did,
the failures are probably due to limitations of the dc solver or the
non-linear algorithm (niiter/ sparse) that calls it.  The XYCE documents on
the web seem to indicate that this is the case.  Also, I don't recall how
many of these errored out in the transient solver, but of those that did, it
could be due to limitations of the transient
(time-step/integration/truncation-error) code, or limitations of the
non-linear solver.

Now that I have your ear, I'm going to rattle off a bunch of questions... If
you don't have time or interest in answering them, I'll understand, but I
would really appreciate your feedback.

Bias point:
>From the XYCE news on the web site, I learned that you switched originally
to KSparse from Spice's Sparse module, and then to KLU.
It sounds like you totally replaced KSparse with KLU, and we're going to
follow your example.
Neal is trying to implement a nonlinear solver for our spice engine using
NOX & KLU using Amesos.  He hasn't found any examples showing how to do
this, and he's been struggling for many hours with it.  If you can give him
any advice, I'm sure that he'd appreciate it.
I am also wondering what options we should use in NOX... i.e., there's trust
region based and tensor based and newton.
I think that I read in the math article that voltage limiting prevents some
of these strategies from working.  Can we make use of strategies other than
Newton if we tell NOX not to use the solutions when voltage limiting is
being used?
Forgive my ignorance if these questions make no sense... As you know, I'm a
novice with numerical methodes.  Neal has been looking into this more than I
have, but I can tell that his one undergrad class in numerical methods
didn't prepare him for this.

Lambert W-Function (and expl):
The Xyce news (from version 1.1) also mentions that XYCE uses the Lambert
W-function... I think that I read that you use it as a replacement for the
exponential in the diode (and other pn junctions?)  Is this something that I
should look into?  I did some brief reading just now on this function, but I
don't see how it will help.  Regarding exponentials, there's also the
expl(x,limit) function, which behaves like an exponential up to the limit
then becomes a linear function with the slope of exp(limit).  I can see how
this would potentially help the simulator with diodes.  Can you tell me any
more about how you use Lambert-W and why?  And did your changes using the W
function have a big impact on simulator performance?

Transient considerations:
Also, we've been taking circuits from Christophe Basso SMPS book (version 2)
and trying to get them to work in B2 Spice.  None of them failed for the
initial state, but some of them had problems with the transient (e.g.
current mode open-loop buck).  In fact, we've had many cases over the years
of circuits that fail in the transient simulation.  We can usually get them
to work if we play around with them enough, e.g. cutting down the transient
time step to a crawl, and switching to gear method, but for a typical user,
they wouldn't know the right tricks to use and it wastes time, and sometimes
the simulation takes forever to run, and this is obviously not desirable.
This makes me think that we should enhance elements of the transient
'engine', i.e. the time stepping algorithm, or the
integration/truncation-error code.  I've read in the XYCE news that your
group has made good progress in this area.
I've read about TR-BDF2 being superior to other integration methods (e.g.
spice's trapezoidal 'gear'... which have several 'orders', and relate to
backward euler, &/or forward euler... I'm not an expert on these).  But I
haven't implemented TR-BDF2 yet.  Also, I've thought about storing 'good'
states every so often, and detecting when the transient simulation is in
serious trouble, and then backing up to the good states and reducing the
time-step or altering other parameters at that point, then continuing from
that point in time.
Is there code in Trilinos to help with any of this stuff?  If not, can you
steer me in fruitful directions?

By the way, even PSpice has trouble with some of Basso's power-related
circuits.  They have behavioral models within them, e.g. nonlinear
controlled sources with if-then-else statements and tables of data and
max/min functions... elements that create discontinuities.  This may be a
source of the problems when simulating the circuits.  Which brings up the
topic of discontinuities and rapidly changing voltages caused by
user-defined behavioral sources, and how to deal with those.  I'll stop

If you can give any advice to Neal & myself and others to give us direction,
I'd appreciate it.  I.e. pointing us to research articles that will help us
understand the issues better... or advice on which modules in trilinos will
help and how to use them.

Also, if you or any of the other xyce developers are interested in a copy of
B2 Spice, let me know and I'll send you download instructions and a free
full license.  And if you ever want to collaborate on code related to
schematics, the simulator (I have a lot of experience with the spice engine
even though I'm relatively ignorant on the numerical methods),
post-processing, or interactive results display such as virtual instruments,
let me know.

Thanks again, and sorry if I'm asking for too much help.  Please feel free
to be direct with me if I am... I realize that I'm asking a lot of
questions, and they might not be things that you can reply to quickly.

Jon Engelbert
primary author of B2 Spice A/D
734 332-0487
jon at beigebag.com, www.beigebag.com

-----Original Message-----
From: Eric Keiter [mailto:erkeite at sandia.gov]
Sent: Wednesday, July 18, 2007 4:32 PM
To: Neal Parsons; Rob Hoeksta
Cc: trilinos-users at software.sandia.gov; jon at beigebag.com; Mike Heroux;
Randall Bramley; Eric Keiter
Subject: Re: [Trilinos-Users] XYCE math formulation article

Hi Neal,

I think I received an email from your boss a few months ago.  Sorry I didn't
reply - I was on travel at the time.

Anyway, I'm the original author of the math document.  The thing you are
doing in SPICE to produce the J*dx = -f system should work, as long as the
J*x you are subtracting is based on the correct x.  I actually considered
doing that at one point in time, (mostly for doing exact comparisons between
Xyce and SPICE) but never got around to it.

The dx that the linear solver is solving for needs to be the total dx - ie
the sum of the dx from the Newton step and also the dx due to voltage
limiting.  SPICE applies the dx_{voltlim} in an implicit way, so SPICE's
unmodified system is in fact:

  J * x^{k+1} = -f + J * x_{voltlim}

Where x_{voltlim} is the solution after voltage limiters have been applied
to it.

If you add -J*x^{k} to both sides you get:

  J* dx_{total} = -f + J * dx_{voltlim}

  where:  dx_{voltlim} = x_{voltlim}-x^{k}
          dx_{total} = dx_{Newton} + dx_{voltlim}

SPICE applies the voltlim terms internally to each device load, by applying
the limiter functions to local variables that are derived from what is in
x_{k}.  Then, when it calculates J and f, it uses the "limited" values, and
essentially throws away x_{k}.

Anyway, this is all a long way of saying that what you are doing should
work.  And, I wouldn't call it "cheating" ;  doing it your way is a lot
faster than rewriting every device model!

Also regarding your question on numerical errors;  there will of course be
roundoff differences between what you are now doing and SPICE's original
form.  However, those roundoff differences won't necessarily be more or less
accurate.  In particular, if you want to use iterative linear solvers (you
probably don't...), iterative solvers will generally be happier if you are
solving for dx, as the values in dx will tend to be small, especially as the
solution gets near convergence.

The main reason we used the DX formulation, however, is that this is what
most nonlinear solvers do, and we planned to use  the NOX solver library.

By the way, the math document is mostly written for folks who have numerical
backgrounds but haven't ever tried to understand circuit simulation before;
it doesn't have much in it that would be pertinent to making a circuit
simulator faster or more robust.

Eric Keite
erkeite at sandia.gov

On 7/18/07 11:59 AM, "Neal Parsons" <nparsons at umich.edu> wrote:

> Hi Robert (etc.),
> I have a few questions/ observations about the xyce math formulation
> First, in the Xyce Math document, it describes reformulating the
> equations in xyce to solve for delta X instead of X.  I'm wondering
> "how" you do that internally in the code and if our approach is sound.
> We are solving for delta X instead of X, but we aren't doing in each
> devices' load routine.  We are doing it by "cheating".  Specifically,
> our strategy is to generate J and the RHS with Spice's unmodified
> device Load routines, then we are subtracting J * X[k] from the RHS to
> get "f".  So the first question I have is whether this approach is
> inferior to modifying each device's DevLoad routine to directly place
> "f" (or is it "-f") in the RHS.
> I'm concerned about mathematical errors related to the magnitudes of
> the terms being subtracted (e.g. 5.000000000000000001 - 5.0 = 0).
> Second, regarding voltage limiting, the math article (p. 27) suggests
> that we need to add J * (delta X hat)[k+1] to "-f".  But with our
> "cheat", this modification to "-f" seems to be unnecessary.  Perhaps
> this is because when the RHS is calculated by spice in the "Spice" way,
> it already takes into account the voltage limiting?  I'm not sure about
> that.  But if you have any insight into these issues, I'd appreciate
> hearing it.
> --
> Neal Parsons (and Jon Engelbert, my boss)
> University of Michigan
> Mechanical Engineering
> Cell: 517-896-2171
> Quoting Neal Parsons <nparsons at umich.edu>:
>> Thank you all for your responses. Since our SPICE package generally
>> deals with smaller circuits, it sounds like Amesos/KLU solver would
>> be better than the iterative AztecOO solver. I'm working for a very
>> small software company for the summer that currently uses the Berkely
>> SPICE engine, which fails roughly 1/3 of our NCSU benchmark test
>> suite. The motivation for integrating Trilinos libraries comes from
>> reading about XYCE's use of Trilinos.  We are hoping to get better
>> convergence and faster results.
>> Our solvers were updated to solve J*deltaX(k+1) = -f, as Xyce does.
>> We initially used AztecOO Line Search Based solver with ILU
>> preconditioning for our solver. This gave accurate operating point
>> results, but as noted, was generally slow for transient computations.
>> After wrestling with AztecOO for several weeks to no avail and given
>> the advice here, it appears that my next step would be to change-over
>> from AztecOO to Amesos/KLU. I just want to make sure that this
>> implementation will be beneficial to our application. Also, any other
>> suggestions might be helpful as well. Again, thanks for your help.
>> Neal Parsons
>> Quoting "Hoekstra, Robert J" <rjhoeks at sandia.gov>:
>>> As Mike mentioned, KLU (by Tim Davis, also the originator of the
>>> aforementioned UMFPack)
>>> which is the default solver in the Trilinos/Amesos package has specific
>>> optimizations for
>>> circuit problems and should outperform the more generic tools (UMFPack
>>> and SuperLU) for
>>> sparse direct solves.  As mentioned by both Mike and Randy, it is very
>>> difficult to get
>>> iterative sparse methods to perform well on this class of problems.  For
>>> the Xyce simulator,
>>> (Sandia's in house tool) we use GMRES with ILU(k) preconditioning only
>>> for very large scale
>>> problems (approaching a million or more devices).  If you are bent on
>>> demonstrating an
>>> iterative method rather than performance our default usage is along the
>>> lines of:
>>> AztecOO GMRES
>>> EpetraExt Singleton Filtering
>>> Ifpack ILU(k) w/ RCM Reordering
>>> Just using AztecOO's ILU(t) may do the trick but I am not sure if you
>>> will get the RCM
>>> reordering by default and it's bandwidth minimization is critical to the
>>> performance of
>>> the preconditioner.  You can probably neglect the singleton filter
>>> unless you have a very
>>> large distributed problem but not the bandwidth minimizing reorder
>>> (approx. minimum degree
>>> works even better than RCM and is what Tim Davis uses in KLU but it is
>>> not available in Ifpack
>>> or AztecOO).
>>> Robert Hoekstra
>>> ____________________________
>>> Electrical & Microsystems Modeling
>>> Sandia National Laboratories
>>> P.O. Box 5800 / MS 0316
>>> Albuquerque, NM 87185
>>> phone: 505-844-7627
>>> fax:      505-284-5451
>>> e-mail: rjhoeks at sandia.gov
>>> web: http://www.cs.sandia.gov
>>> -----Original Message-----
>>> From: trilinos-users-bounces at software.sandia.gov
>>> [mailto:trilinos-users-bounces at software.sandia.gov] On Behalf Of Randall
>>> Bramley
>>> Sent: Thursday, July 12, 2007 12:37 PM
>>> To: Heroux, Michael A
>>> Cc: trilinos-users at software.sandia.gov
>>> Subject: Re: [Trilinos-Users] Slow AztecOO solve time
>>>>> I'm a student, and a beginner with Trilinos trying to use the AztecOO
>>>>> solver in a SPICE package.   I am getting relatively slow results,
>>> Circuit simulation problems are notoriously difficult for iterative
>>> solvers with non-specialized preconditioners like ILUt, etc. It's not
>>> just a matter of a bad condition number, but the overall eigenvalue
>>> distributions tend to give worst-case properties for Krylov solvers.
>>> That would explain having a large number of iterations, but not
>>> necessarily the time spent in memory management - unless it is used on
>>> every iteration, or you have an lax dropping parameter set in the
>>> preconditioner (and hence many fill-in locations). So this may not be of
>>> help on your problem but be aware that just about any iterative solver
>>> package (e.g., Hypre, PETsC, ...) will in general perform slowly for
>>> SPICE.
>>> There may be some specialized application-specific preconditioners
>>> available now for circuit simulation, but I've been out of the game for
>>> too long to know. Still, that is something I'd recommend hunting for -
>>> it would well repay a couple of hours on Google if such a magic
>>> preconditioner has been developed. And if you do find such a thing,
>>> please post it back to this mail list - a lot of us would be interested
>>> and grateful.
>>> Way back when *I* was a student (shortly after the War of 1812) we
>>> typically had to bite the bullet and use a direct sparse solver rather
>>> than iterative methods. Or more accurately, a complete LU decomposition
>>> followed by a few steps of iterative refinement. Ain't no such thing as
>>> purely direct or purely iterative solvers nowadays and all practical
>>> solvers like Trilinos are essentially hybrids.
>>> Mike's recommendation is good, but I'd go further. You can get a quick
>>> estimate of the number of nonzeros (and sparsity pattern) of the LU
>>> factors when using SuperLU or UMFPACK. Take that number, multiply by 8
>>> to get the number of bytes, and see if that fits inside the amount of
>>> memory you have on the target machine. For parallel solves like
>>> SuperLUdist you'd want to do the same on a per process basis.  Using
>>> memory size can also be used to determine how much fill-in would be
>>> optimal for a solver, modulo adding in 3-10 vectors for the iterative
>>> solver.
>>> _______________________________________________
>>> Trilinos-Users mailing list
>>> Trilinos-Users at software.sandia.gov
>>> http://software.sandia.gov/mailman/listinfo/trilinos-users
>>> _______________________________________________
>>> Trilinos-Users mailing list
>>> Trilinos-Users at software.sandia.gov
>>> http://software.sandia.gov/mailman/listinfo/trilinos-users
>> _______________________________________________
>> Trilinos-Users mailing list
>> Trilinos-Users at software.sandia.gov
>> http://software.sandia.gov/mailman/listinfo/trilinos-users
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at software.sandia.gov
> http://software.sandia.gov/mailman/listinfo/trilinos-users

No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.5.476 / Virus Database: 269.10.6/902 - Release Date: 7/15/2007
2:21 PM

No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.5.476 / Virus Database: 269.10.6/902 - Release Date: 7/15/2007
2:21 PM

More information about the Trilinos-Users mailing list