[Trilinos-Users] MinTraceDavidson

Sander Schaffner ssander at student.ethz.ch
Fri Feb 27 08:15:01 MST 2015


Hey Alicia

Here is the timing output:

TimeMonitor results over 32 processors

Timer Name                                           MinOverProcs         MeanOverProcs        MaxOverProcs         MeanOverCallCounts
----------------------------------------------------------------------------------------------------------------------------------------
Anasazi: PseudoBlockMinres::Add Multivectors         28.84 (1.65e+04)     59.7 (1.65e+04)      79.05 (1.65e+04)     0.003618 (1.65e+04)
Anasazi: PseudoBlockMinres::Apply Operator           686.7 (3328)         757.4 (3328)         828.9 (3328)         0.2276 (3328)
Anasazi: PseudoBlockMinres::Apply Preconditioner     0 (0)                0 (0)                0 (0)                0 (0)
Anasazi: PseudoBlockMinres::Assignment (no locking)  36.22 (2.314e+04)    52.61 (2.314e+04)    61.97 (2.314e+04)    0.002274 (2.314e+04)
Anasazi: PseudoBlockMinres::Compute Dot Product      101.5 (6628)         176 (6628)           251.8 (6628)         0.02655 (6628)
Anasazi: PseudoBlockMinres::Compute Norm             0 (0)                0 (0)                0 (0)                0 (0)
Anasazi: PseudoBlockMinres::Lock Converged Vectors   5.643 (509)          9.562 (509)          11.34 (509)          0.01879 (509)
Anasazi: PseudoBlockMinres::Scale Multivector        11.99 (2.307e+04)    21.29 (2.307e+04)    30.53 (2.307e+04)    0.000923 (2.307e+04)
Anasazi: PseudoBlockMinres::Total Time               1077 (28)            1077 (28)            1077 (28)            38.47 (28)
Anasazi: TraceMinBase::Computing residuals           2.649 (29)           3.903 (29)           5.705 (29)           0.1346 (29)
----------------------------------------------------------------------------------------------------------------------------------------
Anasazi: TraceMinBase::Direct solve                  10.02 (29)           10.16 (29)           10.34 (29)           0.3504 (29)
Anasazi: TraceMinBase::Initialization                465.9 (18)           465.9 (18)           466 (18)             25.89 (18)
Anasazi: TraceMinBase::Local update                  195.1 (58)           251.4 (58)           276.4 (58)           4.335 (58)
Anasazi: TraceMinBase::Operation M*x                 0 (0)                0 (0)                0 (0)                0 (0)
Anasazi: TraceMinBase::Operation Op*x                6.102 (29)           6.891 (29)           7.707 (29)           0.2376 (29)
Anasazi: TraceMinBase::Orthogonalization             76.26 (29)           76.35 (29)           76.45 (29)           2.633 (29)
Anasazi: TraceMinBase::Solving saddle point problem  1077 (28)            1077 (28)            1077 (28)            38.47 (28)
Anasazi: TraceMinBase::Sorting eigenvalues           0.022 (29)           0.03183 (29)         0.06859 (29)         0.001098 (29)
Anasazi: TraceMinBaseSolMgr locking                  471.6 (17)           471.6 (17)           471.6 (17)           27.74 (17)
Anasazi: TraceMinBaseSolMgr restarting               0 (0)                0 (0)                0 (0)                0 (0)
----------------------------------------------------------------------------------------------------------------------------------------
Anasazi: TraceMinBaseSolMgr::solve()                 2766 (1)             2766 (1)             2766 (1)             2766 (1)
Anasazi: TraceMinRitzOp: *Petra::Apply()             503.9 (3328)         576.9 (3328)         642.8 (3328)         0.1733 (3328)
Anasazi: TraceMinRitzOp::Apply()                     503.9 (3328)         576.9 (3328)         642.8 (3328)         0.1734 (3328)

I guess the information is in the Anasazi: PseudoBlockMinres:: timers.
Is it possible to deduce the number of inner iterations from them?
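
One rough way to read an inner iteration count off these timers is sketched below in Python. It divides the call count of PseudoBlockMinres::Apply Operator (3328) by the number of saddle point solves (28). This rests on an assumption the thread does not confirm: that each inner MINRES iteration applies the operator exactly once to the whole block, so any extra application (for example for an initial residual) would make the estimate slightly too high. The function names are made up for illustration and are not part of Anasazi.

```python
# Call counts and times copied from the TimeMonitor output above
# (the call count is the number in parentheses).
APPLY_OPERATOR_CALLS = 3328          # PseudoBlockMinres::Apply Operator
SADDLE_POINT_SOLVES = 28             # TraceMinBase::Solving saddle point problem
APPLY_OPERATOR_MEAN_SECONDS = 757.4  # Apply Operator, MeanOverProcs column


def mean_applies_per_solve(apply_calls, solves):
    """Average number of operator applications per saddle point solve.

    ASSUMPTION (not confirmed in the thread): one Apply Operator call
    corresponds to one inner MINRES iteration on the whole block.
    """
    if solves <= 0:
        raise ValueError("number of solves must be positive")
    return apply_calls / solves


def seconds_per_apply(total_seconds, apply_calls):
    """Mean wall-clock time of a single operator application."""
    if apply_calls <= 0:
        raise ValueError("number of applications must be positive")
    return total_seconds / apply_calls


if __name__ == "__main__":
    iterations = mean_applies_per_solve(APPLY_OPERATOR_CALLS, SADDLE_POINT_SOLVES)
    cost = seconds_per_apply(APPLY_OPERATOR_MEAN_SECONDS, APPLY_OPERATOR_CALLS)
    print(f"about {iterations:.1f} operator applications per saddle point solve")
    print(f"about {cost:.4f} s per operator application")
```

With the numbers above this gives roughly 119 operator applications per saddle point solve, and the time per application reproduces the 0.2276 s in the MeanOverCallCounts column, which at least suggests the counts are being read correctly.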

Thanks

Sander

On 27.02.2015 at 15:55, Alicia Klinvex wrote:
> Hello Sander,
>
> I understand that you want some information about the inner 
> iterations.  I need the timer output so that I can point that 
> information out to you.  Does TraceMinDavidson print information about 
> its running time when solve() returns?  If not, you need to enable 
> that through your parameter list, e.g. MyPL.set ("Verbosity", 
> Anasazi::TimingDetails).
>
> - Alicia
>
> On Fri, Feb 27, 2015 at 9:45 AM, Sander Schaffner
> <ssander at student.ethz.ch> wrote:
>
>     Hey
>
>     I'm just wondering how I can get some information about the
>     inner iterations (number of iterations, time to solution, ...),
>     since the effect of the preconditioner should be seen there. Is
>     there a debug flag I have to set?
>
>     Best wishes,
>     Sander
>
>     On 27.02.2015 at 15:38, Alicia Klinvex wrote:
>>     Hello,
>>
>>     Can you send me the output you're seeing of the timers?  I'll
>>     help you interpret them.
>>
>>     Ugh.  "The number of operations Op*x"  You just reminded me that
>>     it's inaccurate, because it does not include any of the matvecs
>>     from the inner Krylov iterations.  Sorry about that.  If you want
>>     the total number of Krylov iterations, we can also get that
>>     information from the timers.
>>
>>     Best wishes,
>>     Alicia
>>
>>     On Fri, Feb 27, 2015 at 9:15 AM, Sander Schaffner
>>     <ssander at student.ethz.ch> wrote:
>>
>>         Hey Alicia
>>
>>         Thanks for the update on public Trilinos.
>>
>>         I'm at the point where I'm measuring the performance of my
>>         code. Is there any possibility to get the time of the inner
>>         iterations (or did I miss this in the output)? And is the
>>         number of iteration steps in the inner loop (solving the
>>         saddle point problem) indirectly given by "The number of
>>         operations Op*x"? For example, if it rises from one iteration
>>         step to the next by 35 and I have a block size of 5, did it
>>         take 7 inner iteration steps?
>>
>>         Thanks
>>
>>         Sander
>>
>>         -------- Forwarded Message --------
>>         Subject: Re: MinTraceDavidson
>>         Date:    Tue, 17 Feb 2015 17:09:18 +0100
>>         From:    Sander Schaffner <ssander at student.ethz.ch>
>>         To:      Alicia Klinvex <aklinvex at purdue.edu>
>>
>>
>>
>>         The jumping eigenvalue problem never occurred again.
>>
>>         I actually have two small questions:
>>
>>         1: If we run the solver with output, I see only the iteration
>>         steps of the outer loop, as I understand it. So I don't see
>>         how it finds the new vectors to enlarge the search space. Is
>>         there a way to turn that on?
>>
>>         2: So far I never got faster results when I activated a
>>         preconditioner. I tried ifpack2:diagonal, ifpack2:ilut, muelu
>>         (smoother gauss, coarse klu2) and a band matrix which I handed
>>         to ifpack2:ilut. Do you use preconditioners, and if so, which?
>>
>>         Thanks
>>
>>         Sander
>>
>>         On 13.02.2015 at 18:44, Alicia Klinvex wrote:
>>>         It should find the three 0 eigenvalues first, and it does. 
>>>         They just never get flagged for convergence if you monitor
>>>         the RELATIVE residual.  Anything relative to 0 is going to
>>>         be huge.
>>>
>>>         The block preconditioning should probably hit the repo next
>>>         week.
>>>
>>>         Are you still seeing the problem you were before where small
>>>         eigenvalues jump in out of nowhere?
>>>
>>>         - Alicia
>>>
>>>         On Fri, Feb 13, 2015 at 7:57 AM, Sander Schaffner
>>>         <ssander at student.ethz.ch> wrote:
>>>
>>>             Hi Alicia
>>>
>>>             The matrix A that I gave you has a slightly different
>>>             null space. Each value has to be corrected with the
>>>             atom_mass.dat vector. Just have a look at the
>>>             test_nullspace script I wrote today (also in the Dropbox
>>>             folder). Sorry for the confusion! The point is this: in
>>>             the beginning I had to work on a matrix with no
>>>             (physical) dimensions, and we exchanged it for the new
>>>             one. So each term in A is also corrected with the
>>>             atom_mass. That's why we have a different null space now
>>>             (each 1 has to be multiplied by the square root of the
>>>             corresponding entry of atom_mass.dat).
>>>
>>>             And I was also able to calculate many more eigenpairs.
>>>             I'm now at 300 in 3 hours with ILUT as a preconditioner.
>>>             And yes, the smallest eigenvalue from your calculation is
>>>             correct, although I'm a bit confused, since it should
>>>             find the 3 zero eigenvalues first?
>>>
>>>             Is there any progress regarding other saddle point
>>>             solvers for the public repository of TraceMinDavidson?
>>>
>>>             Best wishes,
>>>             Sander
>>>
>>>
>>>             On 12.02.2015 at 20:34, Alicia Klinvex wrote:
>>>>             Hello Sander,
>>>>
>>>>             I'm copying the user list, since we probably should
>>>>             have been doing that all along. This way, you can get
>>>>             help from all the Trilinos folks, not just me.  (I'm
>>>>             definitely the right person to ask about TraceMin, but
>>>>             I don't know much about Ifpack/ML, etc.)  For everyone
>>>>             who is not Sander, here is a recap of his situation: He
>>>>             wants to compute the smallest eigenpairs of a symmetric
>>>>             positive definite matrix, and he's been brave enough to
>>>>             try Anasazi's newest eigensolver, TraceMin-Davidson. He
>>>>             is having trouble with TraceMin-Davidson, as well as a
>>>>             few of the other eigensolvers.
>>>>
>>>>             I had a look at your matrix in Matlab, and it's a
>>>>             little unclear to me what the null space is. From your
>>>>             code, it looks like the basis is supposed to be [1 0 0
>>>>             1 0 0 1 0 0 ...; 0 1 0 0 1 0 0 1 0 ...; 0 0 1 0 0 1 0 0
>>>>             1]'. When I imported your matrix in Matlab and computed
>>>>             A*[your null space basis], I was getting a nonzero
>>>>             vector (and by nonzero, I mean its norm was 1.2685).
>>>>
>>>>             I ran TraceMin-Davidson without calling setAuxVecs, and
>>>>             I did not experience the behavior you did.  (150
>>>>             eigenpairs were locked, but the three vectors of the
>>>>             null space never converged...which isn't surprising,
>>>>             considering that I asked for a small relative residual,
>>>>             and for those vectors, the relative residual is defined
>>>>             as absolute residual / 0.)  Is it possible that your
>>>>             basis for the null space is incorrect, or did I misread
>>>>             your code?
>>>>
>>>>             Best wishes,
>>>>             Alicia
>>>>
>>>>             PS: The smallest eigenvalue TraceMin-Davidson found was
>>>>             1.869423e-2.  Is the matrix you sent me the same one
>>>>             you used to generate your results?
>>>>
>>>>             On Wed, Feb 4, 2015 at 2:49 PM, Sander Schaffner
>>>>             <ssander at student.ethz.ch> wrote:
>>>>
>>>>                 Of course: matrixFile.mtx -> it's about 2 GB
>>>>
>>>>                 Best wishes
>>>>
>>>>                 Sander
>>>>
>>>>                 On 04.02.2015 at 20:21, Alicia Klinvex wrote:
>>>>>                 Hello Sander,
>>>>>
>>>>>                 I don't see the matrices in your dropbox. Can you
>>>>>                 save them as [something].mtx?
>>>>>
>>>>>                 Thank you,
>>>>>                 Alicia
>>>>>
>>>>>                 On Wed, Feb 4, 2015 at 2:01 PM, Sander Schaffner
>>>>>                 <ssander at student.ethz.ch> wrote:
>>>>>
>>>>>                     Hey!
>>>>>
>>>>>                     It seems that it is not reproducible. I ran it
>>>>>                     again 2 times for 8 hours, and once I got 129
>>>>>                     eigenpairs and the other time 230. Meanwhile,
>>>>>                     the matrix is in the Dropbox folder in
>>>>>                     MatrixMarket format. It would be nice if you
>>>>>                     could feed this matrix into your setup once.
>>>>>
>>>>>                     Thanks
>>>>>
>>>>>                     Sander
>>>>>
>>>>>                     On 02.02.2015 at 16:18, Alicia Klinvex wrote:
>>>>>>                     Hello,
>>>>>>
>>>>>>                     Can you save your matrix as a MatrixMarket or
>>>>>>                     Matlab file?  Also, can you turn on the
>>>>>>                     debugging output and rerun the program?
>>>>>>
>>>>>>                     Thank you,
>>>>>>                     Alicia
>>>>>>
>>>>>>                     On Mon, Feb 2, 2015 at 6:21 AM, Sander
>>>>>>                     Schaffner <ssander at student.ethz.ch> wrote:
>>>>>>
>>>>>>                         Hey Alicia
>>>>>>
>>>>>>                         I hope this works for you:
>>>>>>
>>>>>>                         https://www.dropbox.com/sh/ouw90m0m00s6kyn/AAAilijMstKYKa_7Mx4hQaQ9a?dl=0
>>>>>>
>>>>>>                         I had to write my own TpetraExt for the
>>>>>>                         functions I need. Just copy all the files
>>>>>>                         into the same directory and change the
>>>>>>                         Trilinos path in the Makefile. In the
>>>>>>                         sbatch file you can see how I called the
>>>>>>                         program after compilation.
>>>>>>
>>>>>>                         Is this okay for you? Thanks a lot
>>>>>>
>>>>>>                         Best wishes,
>>>>>>                         Sander
>>>>>>
>>>>>>
>>>>>>
>>>>>>                         On 30.01.2015 at 18:52, Alicia Klinvex
>>>>>>                         wrote:
>>>>>>>                         Well that's not good!  It looks like
>>>>>>>                         there may be a problem with the
>>>>>>>                         restarting.  (I'll have a look at it.)
>>>>>>>
>>>>>>>                         My update isn't in the public repo yet,
>>>>>>>                         but thanks for reminding me. I've been
>>>>>>>                         running tests with it, and if you're
>>>>>>>                         interested in using preconditioners, the
>>>>>>>                         block diagonal preconditioning patch
>>>>>>>                         will be of interest to you.  (Mark
>>>>>>>                         Hoemmen generally helps me out with
>>>>>>>                         these things, i.e. wrapping sensitive
>>>>>>>                         bits of code so I don't break the
>>>>>>>                         Trilinos configuration process, but he
>>>>>>>                         was on vacation for most of January. 
>>>>>>>                         I'll ping him to see if he can do it now.)
>>>>>>>
>>>>>>>                         Can you send me your matrix and driver,
>>>>>>>                         so I can see what's going on?  I suspect
>>>>>>>                         the solver manager accidentally removed
>>>>>>>                         the wrong vector from the subspace when
>>>>>>>                         it locked a vector.
>>>>>>>
>>>>>>>                         Best wishes,
>>>>>>>                         Alicia
>>>>>>>
>>>>>>>                         PS: as far as getting unconverged Ritz
>>>>>>>                         vectors goes...it's complicated. It's
>>>>>>>                         SUPPOSED to be difficult to access them,
>>>>>>>                         because they aren't going to be accurate
>>>>>>>                         anyway.  What you have to do if you
>>>>>>>                         desperately want to get them is create
>>>>>>>                         your own custom debug status test. I can
>>>>>>>                         give you more information about this if
>>>>>>>                         you like, but I don't think the
>>>>>>>                         unconverged Ritz vectors will give you
>>>>>>>                         much information. You could always turn
>>>>>>>                         on the Anasazi::Debug output option,
>>>>>>>                         which will give you information about
>>>>>>>                         loss of orthogonality and whatnot.
>>>>>>>
>>>>>>>                         On Fri, Jan 30, 2015 at 9:51 AM, Sander
>>>>>>>                         Schaffner <ssander at student.ethz.ch> wrote:
>>>>>>>
>>>>>>>                             Hi Alicia
>>>>>>>
>>>>>>>                             Something new and strange happened
>>>>>>>                             to me: I was able to calculate 400
>>>>>>>                             eigenpairs of matrix A (dim = 150k,
>>>>>>>                             nnz = 60e9) in 4.5 hours. Then I
>>>>>>>                             received an updated matrix. The
>>>>>>>                             structure stayed the same, but the
>>>>>>>                             values were updated by the square
>>>>>>>                             root of either mass a or mass b.
>>>>>>>                             Therefore the values changed from
>>>>>>>                             1e1 to 1e27. So I had to rescale,
>>>>>>>                             since the solver did not work for
>>>>>>>                             values around 1e27. I now have a
>>>>>>>                             matrix B with the same structure as
>>>>>>>                             A but different values of more or
>>>>>>>                             less the same magnitude.
>>>>>>>
>>>>>>>                             I first calculated 100 eigenpairs,
>>>>>>>                             which worked in 20 min. But it did
>>>>>>>                             not give me any results for 150
>>>>>>>                             eigenpairs after 8 hours. The solver
>>>>>>>                             doesn't converge. I had a look at
>>>>>>>                             the output. After finding 77
>>>>>>>                             eigenpairs (I start with 3 auxiliary
>>>>>>>                             vectors), where the smallest
>>>>>>>                             eigenvalue is 1.92e-4, it suddenly
>>>>>>>                             has a new Ritz value of 1.2e-7
>>>>>>>                             after iteration step 23 (it somehow
>>>>>>>                             looks as if it were smuggling itself
>>>>>>>                             into the list). And this value must
>>>>>>>                             be wrong, since we should not drop
>>>>>>>                             below 1.92e-4. Do you have any idea
>>>>>>>                             how to correct this or why this
>>>>>>>                             happens? In the lower part of the
>>>>>>>                             mail you find two snippets of the
>>>>>>>                             TraceMinBase Solver Status. I
>>>>>>>                             thought I should have a look at the
>>>>>>>                             vector which belongs to this value,
>>>>>>>                             but I did not find how to configure
>>>>>>>                             the solver so that it stops after a
>>>>>>>                             certain number of iterations. Is
>>>>>>>                             there any possibility to get the
>>>>>>>                             Ritz vectors although the solver did
>>>>>>>                             not finish?
>>>>>>>
>>>>>>>                             At the moment I'm using ILUT as a
>>>>>>>                             preconditioner. I'm working on
>>>>>>>                             getting MueLu to run, to see whether
>>>>>>>                             it brings any improvement.
>>>>>>>
>>>>>>>                             Then there is another question: did
>>>>>>>                             your update make it into the public
>>>>>>>                             repo? I would love to try other
>>>>>>>                             saddle point solvers :)
>>>>>>>
>>>>>>>                             Best wishes,
>>>>>>>
>>>>>>>                             Sander
>>>>>>>
>>>>>>>                             ================================================================================
>>>>>>>
>>>>>>>                             TraceMinBase Solver Status
>>>>>>>
>>>>>>>                             The solver is initialized.
>>>>>>>                             The number of iterations performed is 23
>>>>>>>                             The block size is         50
>>>>>>>                             The number of blocks is   30
>>>>>>>                             The current basis size is 450
>>>>>>>                             The number of auxiliary vectors is 80
>>>>>>>                             The number of operations Op*x   is 1200
>>>>>>>                             The number of operations M*x    is 0
>>>>>>>
>>>>>>>                             CURRENT EIGENVALUE ESTIMATES
>>>>>>>                             Eigenvalue      Residual(M)     Residual(2)
>>>>>>>                             --------------------------------------------------------------------------------
>>>>>>>                             5.476905e-04    not current     6.561567e-06
>>>>>>>                             8.015696e-04    not current     6.822561e-06
>>>>>>>                             8.067189e-04    not current     8.178570e-06
>>>>>>>                             1.043989e-03    not current     2.537229e-05
>>>>>>>                             1.063398e-03    not current     1.999694e-05
>>>>>>>                             1.077780e-03    not current     2.646044e-05
>>>>>>>                             1.148442e-03    not current     3.388691e-05
>>>>>>>                             1.189898e-03    not current     1.205903e-04
>>>>>>>
>>>>>>>
>>>>>>>                             .
>>>>>>>                             .
>>>>>>>                             .
>>>>>>>
>>>>>>>                             ================================================================================
>>>>>>>
>>>>>>>                             TraceMinBase Solver Status
>>>>>>>
>>>>>>>                             The solver is initialized.
>>>>>>>                             The number of iterations performed is 24
>>>>>>>                             The block size is         50
>>>>>>>                             The number of blocks is   30
>>>>>>>                             The current basis size is 300
>>>>>>>                             The number of auxiliary vectors is 81
>>>>>>>                             The number of operations Op*x   is 1250
>>>>>>>                             The number of operations M*x    is 0
>>>>>>>
>>>>>>>                             CURRENT EIGENVALUE ESTIMATES
>>>>>>>                             Eigenvalue      Residual(M)     Residual(2)
>>>>>>>                             --------------------------------------------------------------------------------
>>>>>>>                             1.266381e-07    not current     not current
>>>>>>>                             5.477028e-04    not current     not current
>>>>>>>                             8.015761e-04    not current     not current
>>>>>>>                             8.067281e-04    not current     not current
>>>>>>>                             1.044027e-03    not current     not current
>>>>>>>                             1.063419e-03    not current     not current
>>>>>>>                             1.077814e-03    not current     not current
>>>>>>>                             1.148475e-03    not current     not current
>>>>>>>                             1.190137e-03    not current     not current
>>>>>>>                             1.210246e-03    not current     not current
>>>>>>>                             1.230605e-03    not current     not current
>>>>>>>                             1.233197e-03    not current     not current
