[Trilinos-Users] [EXTERNAL] Re: Copying Amesos2 Solver and Tpetra MultiVectors

Siva Rajamanickam srajama at sandia.gov
Tue Jun 10 13:28:33 MDT 2014


David,
   This is feature request will not give you a scalable way as your memory usage 
will jump O((|L| + |U|) * #threads).

  The data that is held in Amesos2 can be copied to multiple solvers, yes. 
However, remember that the underlying solvers themselves do not expect this 
usage and usually hold workspace and other data structures. Some don't even 
construct a complete L, instead provide access to solve() methods.

  It appears you are planning to implement a deep_copy () for a specific solver 
(Superlu_MT), from another thread to Mark H.. Do let us know if you need any 
help. It will be very hard to do a generic deep_copy () though.

Thank


On 06/08/2014 11:37 PM, D H wrote:
> I have experimented with this a little bit, and SuperLU_MT seems to be working pretty well as my chosen solver implementation.  If I use one solver per thread, I can do solves in parallel and get accurate results / no crashes (so, things are acting like they're thread safe, at least on the surface).  I'm not totally sure about the thread scalability; after each thread's solver has done its LU factorization and is just repeatedly doing triangular solves, my CPU usage seems to only go up to about 70%.  That could be for a number of different reasons, so definitely take it with a grain of salt!, but in any case, things seem to be working pretty well.
>
> I did think about some MPI options, but my whole simulation is parallelized based on OpenMP, so it would require a decent amount of effort to rewire everything to be MPI-based.  In light of this, I think SuperLU_MT is the right choice, since it will allow my simulation to compute the LU factorization using multiple threads (internally parallel), then I can simultaneously do triangular solves from different threads (externally parallel, even if it's internally serial that's fine).
>
> I guess my only remaining concern is one I will wait to hear from some Amesos2 developers about, namely this question of getting the results of a factorization out of one solver instance and giving them to another solver instance.  Browsing through the source code, I can see that the computed "L" and "U" are tucked away in the private data_ member of the implementation-specific Amesos2::Superlumt< Matrix, Vector > class.  Looking at this, it doesn't look like there's an obvious way to achieve what I'm trying to do (I guess this is not a freqeuent feature request!), but hopefully some of the developers can shed some light on a way to do this, even if I have to tweak the source code a bit.
>
> Thanks again!
>
> David
>
> ----- Original Message -----
> From: "Mark Hoemmen" <mhoemme at sandia.gov>
> To: "D H" <mrhyde at stanford.edu>, trilinos-users at software.sandia.gov
> Sent: Sunday, June 8, 2014 12:23:26 PM
> Subject: Re: [Trilinos-Users] Copying Amesos2 Solver and Tpetra MultiVectors
>
> On 6/7/14, 11:57 PM, "D H" <mrhyde at stanford.edu> wrote:
>> Thanks very much for your reply!  I really appreciate your help.
> Glad to help out!
>
>> It's good to know about the deep copying for Tpetra::MultiVector.  I do
>> have one follow-up, though.  In my use case, I would like to create a
>> solver and compute the LU factorization (using SuperLU_MT), and then once
>> that is done, I would like to simultaneously solve the system for a large
>> number of right hand sides (using OpenMP).  I don't know all the
>> right-hand sides before I start solving; the right-hand sides are
>> computed one at a time on the fly on different threads, and each right
>> hand side is destroyed as soon as the system is solved with it, so I
>> can't just tell the solver to solve for an NxM right hand side.  Is
>> solver->solve thread-safe, and will it run in parallel ... ?
> Amesos2::Solver does not currently promise that solve() is thread safe.
> There are two reasons for this:
>
>    1. The underlying solver (e.g., SuperLU_MT, Cholmod, ...) might not be
> thread safe.
>
> Just because a solver implementation (e.g., SuperLU_MT) uses threads
> _inside_, doesn't necessarily mean that it's safe to call multiple
> instances of the solver concurrently.
>
>    2. Teuchos::RCP reference count updates are not currently thread safe.
>
> This would not be hard to fix, but we haven't had a chance to do it.
> Also, just because something is thread safe, doesn¹t mean that it's thread
> scalable.  (I know you understand the difference by how you asked the
> question :-) .)  Anything that works like Teuchos::RCP -- including
> boost::shared_ptr and std::shared_ptr (C++11) -- can never be thread
> scalable, because atomic reference count updates are a serialization
> bottleneck.  Any copy constructor or assignment operator call for any of
> these "smart pointers" updates its reference count.  This is why we've
> resisted making Teuchos::RCP thread safe.  (Another reason is that we get
> very few requests for this functionality, so we haven't prioritized it.)
>
> For your application, RCP reference count updates _should_ not be a
> bottleneck, but we haven't done an experiment to test this, as far as I
> know.
>
> Whether solve() is parallel _inside_, depends on the solver
> implementation.  I would have to look at SuperLU_MT to be sure, but I'm
> guessing that it only parallelizes the LU factorization, not the
> triangular solves.  Parallelizing triangular solves effectively incurs
> set-up overhead and doesn't always scale well.
>
>
>> If one solver per thread is a good way to go, though, ...
> See Point #1 above.  It's important to check whether a solver
> implementation promises that concurrent use of a factorization is thread
> safe.  Amesos2 just wraps solvers; it doesn't implement them (currently).
>
>> I would appreciate any insights into how one might extract the LU
>> factorization from one solver instance and put it into another, even if
>> it's not exactly a deep copy.
> I'll have to defer to the Amesos2 developers to see whether this is
> possible, and if so, how to do it.
>
>> P.S. Thanks for the tip regarding Tpetra template parameters!
> Glad to help! :-)  We may eventually even make the first template
> parameter optional, so that you could write e.g., Map<> and Vector<>.
> However, we haven't done that yet.
>
> If you would like to do concurrent solves, another approach would be to
> run them in different MPI processes, and use Tpetra's Import / Export
> functionality to move data around between processes.
>
> Thanks!
> mfh
>
>> ----- Original Message -----
>> From: "Mark Hoemmen" <mhoemme at sandia.gov>
>> To: trilinos-users at software.sandia.gov
>> Sent: Saturday, June 7, 2014 10:30:47 PM
>> Subject: Re: [Trilinos-Users] Copying Amesos2 Solver and Tpetra
>> MultiVectors
>>
>> On 6/7/14, 12:00 PM, "trilinos-users-request at software.sandia.gov"
>> <trilinos-users-request at software.sandia.gov> wrote:
>>> Date: Sat, 7 Jun 2014 10:29:55 -0700 (PDT)
>>> From: D H <mrhyde at stanford.edu>
>>> To: trilinos-users at software.sandia.gov
>>> Subject: [Trilinos-Users] Copying Amesos2 Solver and Tpetra
>>> 	MultiVectors
>>> Message-ID:
>>> 	<2083721164.11288811.1402162195681.JavaMail.zimbra at stanford.edu>
>>> Content-Type: text/plain; charset=utf-8
>>>
>>> Hi everyone,
>>>
>>> I am new to working with Trilinos and have what is hopefully an easy
>>> C++/Trilinos question.
>>>
>>> I'm working with Amesos2 to solve linear systems, and I have code that is
>>> structured like:
>>>
>>> class A {
>>> ...
>>> public B b_instance;
>>> ...
>>> }
>>>
>>> class B {
>>> ...
>>> public Teuchos::RCP<Amesos2::Solver<
>>> Tpetra::CrsMatrix<double,int,int,Tpetra::DefaultPlatform::DefaultPlatform
>>> T
>>> ype::NodeType> ,
>>> Tpetra::MultiVector<double,int,int,Tpetra::DefaultPlatform::DefaultPlatfo
>>> r
>>> mType::NodeType> > > amesos2_solver;
>>> public Teuchos::RCP<
>>> Tpetra::MultiVector<double,int,int,Tpetra::DefaultPlatform::DefaultPlatfo
>>> r
>>> mType::NodeType> > amesos2_X;
>>> public Teuchos::RCP<
>>> Tpetra::MultiVector<double,int,int,Tpetra::DefaultPlatform::DefaultPlatfo
>>> r
>>> mType::NodeType> > amesos2_B;
>>> ...
>>> }
>>>
>>> In my program, I have several instances of class A.  For the first
>>> instance, I use its b_instance to compute the numeric (LU) factorization
>>> of a sparse matrix, which is another class variable of B.  At that point,
>>> amesos2_X and amesos2_B are just random Nx1 vectors, but they have been
>>> associated with amesos2_solver using "amesos2_solver =
>>> Amesos2::create<MAT,MV>(solver_name, my_matrix, amesos2_X, amesos2_B);".
>>>
>>> So now, what I would like to do is to copy the amesos2_solver, amesos2_X,
>>> and amesos2_B variables into the b_instances of my other instances of
>>> class A.  I don't want to just create copies of pointers that are all
>>> pointing to the same place - I would like separate variables that contain
>>> the same information.  The motivation for this is so I can create several
>>> independent solvers with the same matrix but only have to compute the LU
>>> factorization once.
>>>
>>> Can you recommend what is the best way to copy these variabes, or the
>>> underlying computed factorization, to other instances?  I'd be very
>>> grateful for any help you can provide - thanks very much!
>> It is certainly possible to make a deep copy of a Tpetra::MultiVector.
>> You may use operator= to do that for now; in the development version of
>> Trilinos and in future releases, you may use the Tpetra::deep_copy
>> nonmember function.
>>
>> As far as I know, Amesos2 does not currently give you a way to make a deep
>> copy of a Solver instance.  However, for your use case, you don't need to
>> do this.  You could simply share one Amesos2::Solver instance by RCP among
>> all your class instances, and call it with different vectors by using the
>> two-argument version of solve():
>>
>> solver->solve (x, b);
>>
>> The input arguments of the two-argument version of solve() overwrite any
>> setB() or setX() inputs.
>>
>> btw, since you are using all default values of the Tpetra template
>> parameters, you may omit all but one of the Tpetra template parameters.
>> For example:
>>
>> Tpetra::CrsMatrix<double> A;
>> Tpetra::MultiVector<double> x;
>>
>> Please let us know if you have any further questions.
>> mfh
>>
>> _______________________________________________
>> Trilinos-Users mailing list
>> Trilinos-Users at software.sandia.gov
>> https://software.sandia.gov/mailman/listinfo/trilinos-users
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at software.sandia.gov
> https://software.sandia.gov/mailman/listinfo/trilinos-users



More information about the Trilinos-Users mailing list