[Trilinos-Users] trilinos on Leopard - ML failure

Jonathan Hu jhu at sandia.gov
Tue Mar 18 15:04:52 MDT 2008


Hi John,

    The "Aggregation WARNING" message indicates that ML is running out 
of memory while grabbing a single row of the coarse level 4 matrix.   
I'll look into why this might be happening.  In the meantime, 
could you try switching to the hybrid aggregation scheme?  It can be 
turned on by setting the parameter list option "aggregation: type" to
"Uncoupled-MIS".
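
As a rough side check on how the current aggregation is coarsening, the
numbers in the transcript quoted below can be tabulated with a few lines of
Python. This is only a sketch: the values are copied by hand from the
transcript, the variable names are made up for illustration, and nothing
here calls ML itself.

```python
# Values copied by hand from the quoted ML transcript (levels 0-5).
nrows = [18144, 8660, 6722, 4710, 4157, 3300]                  # "Nrows=" per level
max_eig = [1.368026, 1.699324, 1.667780, 1.545570, 1.969376]   # "Max eigenvalue"
p_damping = 1.333333                                           # "P damping factor"
eps0 = 3.0e-2                                                  # "strong threshold"

# Coarsening ratio between consecutive levels: rows(l) / rows(l+1).
ratios = [fine / coarse for fine, coarse in zip(nrows, nrows[1:])]

# Damping factor printed for each level: P damping factor / max eigenvalue.
damping = [p_damping / lam for lam in max_eig]

# "current eps" is halved on each level in the transcript.
eps = [eps0 / 2**level for level in range(len(nrows))]

for level, (r, d) in enumerate(zip(ratios, damping)):
    print(f"level {level}: eps {eps[level]:.6e}, "
          f"coarsening ratio {r:.2f}, damping {d:.6f}")
```

After the first level, each level shrinks the matrix by less than a factor
of 1.5, which lines up with the large "singletons" counts reported in
Phase 3. Whether that is related to the warning is not something this
sketch can show.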

Regards,
Jonathan Hu
> Message: 2
> Date: Mon, 17 Mar 2008 14:29:43 -0600 (MDT)
> From: "John R. Cary" <cary at colorado.edu>
> Subject: [Trilinos-Users] trilinos on Leopard - ML failure
> To: trilinos-users at software.sandia.gov
> Message-ID: <Pine.LNX.4.64.0803171426430.5296 at localhost.localdomain>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
>
> After upgrading to OS X Leopard and building trilinos-8.0.5,
> ML is now crashing on me for sufficiently large problem sizes.
>
> (uname gives Darwin Kernel Version 9.2.0: Tue Feb  5 16:13:22 PST 2008; root:xnu-1228.3.13~1/RELEASE_I386 i386)
>
> The transcript below was produced with debug printing turned up all the
> way.  This is an implicit solve for 4000 cells with a 3-component field.
> At 3000 cells there is no crash.
>
> I have not worked my way through all the combinations to
> figure out what is going on.  I have managed to downgrade
> to trilinos-7.0.9.   I got this to build by
>
>    1. Turning off pytrilinos
>    2. Configuring with --with-ldflags="-framework Accelerate -Wl,-u,_munmap -Wl,-multiply_defined,suppress"
>
> The full configure line is
>
> ../configure --prefix=/usr/local/trilinos-7.0.9 --disable-default-packages --enable-amesos --enable-ml --enable-aztecoo --enable-epetraext --enable-epetra --enable-triutils --enable-teuchos --enable-ifpack --enable-galeri CXX=g++ F77=gfortran --with-ldflags="-framework Accelerate -Wl,-u,_munmap -Wl,-multiply_defined,suppress"
>
>    3. Fixing the AztecOO_MatlabInput.exe link line, which contains both
>       -lepetra and ../../../epetra/src/libepetra.a, by removing the
>       latter. That is, cd to packages/aztecoo/example/AztecOO_MatlabInput,
>       type "make", which will fail, then copy and run the link line
>       directly after removing ../../../epetra/src/libepetra.a from it.
>
> With 7.0.9 the program runs to completion.
>
> I have also checked parallel now.  It works with 7.0.9 as well.
>
> I realize that it is not too much help to you until I can come
> up with a model problem that shows the failure, but I thought
> I would write in case you have some offhand suggestions.
>
> John Cary
>
>
> Entering ML_Gen_MGHierarchy_UsingAggregation
> **************************************************************
> * ML Aggregation information                                 *
> ==============================================================
> ML_Aggregate : ordering           = natural.
> ML_Aggregate : min nodes/aggr     = 2
> ML_Aggregate : max neigh selected = 0
> ML_Aggregate : attach scheme      = MAXLINK
> ML_Aggregate : strong threshold   = 3.000000e-02
> ML_Aggregate : P damping factor   = 1.333333e+00
> ML_Aggregate : number of PDEs     = 1
> ML_Aggregate : number of null vec = 1
> ML_Aggregate : smoother drop tol  = 3.000000e-02
> ML_Aggregate : max coarse size    = 32
> ML_Aggregate : max no. of levels  = 30
> **************************************************************
> ML_Aggregate_Coarsen (level 0) begins
> ML_Aggregate_CoarsenUncoupled : current level = 0
> ML_Aggregate_CoarsenUncoupled : current eps = 3.000000e-02
> Aggregation(UVB) : Total nonzeros = 223104 (Nrows=18144)
> Aggregation(UC) : Phase 0 - no. of bdry pts  = 6824
> Aggregation(UC) : Phase 1 - nodes aggregated = 8428 (18144)
> Aggregation(UC) : Phase 1 - total aggregates = 2900
> Aggregation(UC_Phase2_3) : Phase 1 - nodes aggregated = 8428
> Aggregation(UC_Phase2_3) : Phase 1 - total aggregates = 2900
> Aggregation(UC_Phase2_3) : Phase 2a- additional aggregates = 0
> Aggregation(UC_Phase2_3) : Phase 2 - total aggregates = 8660
> Aggregation(UC_Phase2_3) : Phase 2 - boundary nodes   = 1064
> Aggregation(UC_Phase2_3) : Phase 3 - leftovers = 5760 and singletons = 5760
> Gen_Prolongator (level 0) : Max eigenvalue = 1.368026e+00
>
> Prolongator/Restriction smoother (level 0) : damping factor #1 = 9.746400e-01
> Prolongator/Restriction smoother (level 0) : ( = 1.333333e+00 / 1.368026e+00)
>
> ML_Aggregate_Coarsen (level 1) begins
> ML_Aggregate_CoarsenUncoupled : current level = 1
> ML_Aggregate_CoarsenUncoupled : current eps = 1.500000e-02
> Aggregation(UVB) : Total nonzeros = 183121 (Nrows=8660)
> Aggregation(UC) : Phase 0 - no. of bdry pts  = 5760
> Aggregation(UC) : Phase 1 - nodes aggregated = 2900 (8660)
> Aggregation(UC) : Phase 1 - total aggregates = 962
> Aggregation(UC_Phase2_3) : Phase 1 - nodes aggregated = 2900
> Aggregation(UC_Phase2_3) : Phase 1 - total aggregates = 962
> Aggregation(UC_Phase2_3) : Phase 2a- additional aggregates = 0
> Aggregation(UC_Phase2_3) : Phase 2 - total aggregates = 6722
> Aggregation(UC_Phase2_3) : Phase 2 - boundary nodes   = 0
> Aggregation(UC_Phase2_3) : Phase 3 - leftovers = 5760 and singletons = 5760
> Gen_Prolongator (level 1) : Max eigenvalue = 1.699324e+00
>
> Prolongator/Restriction smoother (level 1) : damping factor #1 = 7.846258e-01
> Prolongator/Restriction smoother (level 1) : ( = 1.333333e+00 / 1.699324e+00)
>
> ML_Aggregate_Coarsen (level 2) begins
> ML_Aggregate_CoarsenUncoupled : current level = 2
> ML_Aggregate_CoarsenUncoupled : current eps = 7.500000e-03
> Aggregation(UVB) : Total nonzeros = 237238 (Nrows=6722)
> Aggregation(UC) : Phase 0 - no. of bdry pts  = 4447
> Aggregation(UC) : Phase 1 - nodes aggregated = 1547 (6722)
> Aggregation(UC) : Phase 1 - total aggregates = 265
> Aggregation(UC_Phase2_3) : Phase 1 - nodes aggregated = 1547
> Aggregation(UC_Phase2_3) : Phase 1 - total aggregates = 265
> Aggregation(UC_Phase2_3) : Phase 2a- additional aggregates = 2
> Aggregation(UC_Phase2_3) : Phase 2 - total aggregates = 4710
> Aggregation(UC_Phase2_3) : Phase 2 - boundary nodes   = 0
> Aggregation(UC_Phase2_3) : Phase 3 - leftovers = 4443 and singletons = 4443
> Gen_Prolongator (level 2) : Max eigenvalue = 1.667780e+00
>
> Prolongator/Restriction smoother (level 2) : damping factor #1 = 7.994662e-01
> Prolongator/Restriction smoother (level 2) : ( = 1.333333e+00 / 1.667780e+00)
>
> ML_Aggregate_Coarsen (level 3) begins
> ML_Aggregate_CoarsenUncoupled : current level = 3
> ML_Aggregate_CoarsenUncoupled : current eps = 3.750000e-03
> Aggregation(UVB) : Total nonzeros = 131256 (Nrows=4710)
> Aggregation(UC) : Phase 0 - no. of bdry pts  = 4051
> Aggregation(UC) : Phase 1 - nodes aggregated = 387 (4710)
> Aggregation(UC) : Phase 1 - total aggregates = 94
> Aggregation(UC_Phase2_3) : Phase 1 - nodes aggregated = 387
> Aggregation(UC_Phase2_3) : Phase 1 - total aggregates = 94
> Aggregation(UC_Phase2_3) : Phase 2a- additional aggregates = 14
> Aggregation(UC_Phase2_3) : Phase 2 - total aggregates = 4157
> Aggregation(UC_Phase2_3) : Phase 2 - boundary nodes   = 0
> Aggregation(UC_Phase2_3) : Phase 3 - leftovers = 4049 and singletons = 4049
> Gen_Prolongator (level 3) : Max eigenvalue = 1.545570e+00
>
> Prolongator/Restriction smoother (level 3) : damping factor #1 = 8.626805e-01
> Prolongator/Restriction smoother (level 3) : ( = 1.333333e+00 / 1.545570e+00)
>
> ML_Aggregate_Coarsen (level 4) begins
> ML_Aggregate_CoarsenUncoupled : current level = 4
> ML_Aggregate_CoarsenUncoupled : current eps = 1.875000e-03
> Aggregation(UVB) : Total nonzeros = 87447 (Nrows=4157)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation WARNING (1)
> Aggregation(UC) : Phase 0 - no. of bdry pts  = 3287
> Aggregation(UC) : Phase 1 - nodes aggregated = 423 (4157)
> Aggregation(UC) : Phase 1 - total aggregates = 48
> Aggregation(UC_Phase2_3) : Phase 1 - nodes aggregated = 423
> Aggregation(UC_Phase2_3) : Phase 1 - total aggregates = 48
> Aggregation(UC_Phase2_3) : Phase 2a- additional aggregates = 7
> Aggregation(UC_Phase2_3) : Phase 2 - total aggregates = 3300
> Aggregation(UC_Phase2_3) : Phase 2 - boundary nodes   = 0
> Aggregation(UC_Phase2_3) : Phase 3 - leftovers = 3245 and singletons = 3245
> Gen_Prolongator (level 4) : Max eigenvalue = 1.969376e+00
>
> Prolongator/Restriction smoother (level 4) : damping factor #1 = 6.770333e-01
> Prolongator/Restriction smoother (level 4) : ( = 1.333333e+00 / 1.969376e+00)
>
> ML_Aggregate_Coarsen (level 5) begins
> ML_Aggregate_CoarsenUncoupled : current level = 5
> ML_Aggregate_CoarsenUncoupled : current eps = 9.375000e-04
> Aggregation(UVB) : Total nonzeros = 51361 (Nrows=3300)
> Aggregation WARNING (1)
> Segmentation fault
>
>
>
> --
> Physics, UCB390, U. Colorado, Boulder, CO 80309
> cary at colorado.edu, p 303-492-1489, f 303-492-0642, NEW CELL 303-881-8572
>   

-- 
Jonathan J. Hu, mailto:jhu at sandia.gov
Postal address: Sandia National Laboratories
                Mailstop 9159
                PO Box 969, Livermore, CA 94551-0969
Tel / Fax (925) 294-2931 