[Trilinos-Users] Epetra_FECrsMatrix memory issues

Hoemmen, Mark mhoemme at sandia.gov
Mon Jul 9 08:32:17 MDT 2012


About the Epetra_FECrsMatrix memory issues: I just got back from travel and haven't had a chance to look at the code yet, but does Epetra_FECrsMatrix clear out the std::vector each time?  I've heard that std::vector::clear() doesn't necessarily deallocate memory.  The canonical idiom for forcing deallocation is to do an std::swap with an empty std::vector, and let the now-not-empty vector fall out of scope:

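{
  std::vector<T> empty;    // default-constructed, owns no elements
  std::swap (empty, v_);   // v_ is now empty; 'empty' holds v_'s old buffer
}                          // 'empty' is destroyed here, releasing that buffer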
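
A minimal sketch of the difference, assuming a plain std::vector<double>. The two helper functions are hypothetical names made up for this illustration and are not part of Epetra:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: empties v with clear().
// clear() destroys the elements but leaves capacity() unchanged,
// so the buffer stays allocated.
inline std::size_t capacity_after_clear(std::vector<double>& v) {
  v.clear();
  return v.capacity();
}

// Hypothetical helper: empties v with the swap idiom.
// The temporary takes over v's buffer and frees it when it goes
// out of scope; v is left with the temporary's (empty) state.
inline std::size_t capacity_after_swap(std::vector<double>& v) {
  {
    std::vector<double> empty;
    std::swap(empty, v);
  }
  return v.capacity();
}
```

Calling capacity_after_clear on a vector of 1000 doubles should still report a capacity of at least 1000, while capacity_after_swap should report whatever a default-constructed vector reports on the implementation at hand (zero on the common ones).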

mfh


________________________________________
From: trilinos-users-bounces at software.sandia.gov [trilinos-users-bounces at software.sandia.gov] on behalf of trilinos-users-request at software.sandia.gov [trilinos-users-request at software.sandia.gov]
Sent: Wednesday, July 04, 2012 12:00 PM
To: trilinos-users at software.sandia.gov
Subject: Trilinos-Users Digest, Vol 83, Issue 1


Today's Topics:

   1. Epetra_FECrsMatrix memory issues? (Nico Schlömer)
   2. Re: [EXTERNAL]  Epetra_FECrsMatrix memory issues?
      (Williams, Alan B)
   3. Re: [EXTERNAL]  Epetra_FECrsMatrix memory issues? (Nico Schlömer)
   4. Re: [EXTERNAL]  Epetra_FECrsMatrix memory issues? (Nico Schlömer)
   5. moar memory leaks/obsessive allocations (Nico Schlömer)
   6. Re: Epetra_FECrsMatrix memory issues? (Bart Janssens)
   7. Re: moar memory leaks/obsessive allocations (Bartlett, Roscoe A.)
   8. Re: moar memory leaks/obsessive allocations (Nico Schlömer)


----------------------------------------------------------------------

Message: 1
Date: Tue, 3 Jul 2012 21:07:10 +0200
From: Nico Schlömer <nico.schloemer at gmail.com>
Subject: [Trilinos-Users] Epetra_FECrsMatrix memory issues?
To: trilinos-users at software.sandia.gov
Message-ID:
        <CAK6Z60eh6spUhX0kmoYMM6d+SEa_X0AyV-epdxz2pY6ZwBp7XA at mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1

Hi all,

I recently discovered that, for large-scale problems, my (LOCA-based)
application code consumes more and more memory as the computation goes
along, eventually resulting in an ungraceful exit of it all. I have now
run a memory profile of the code and, surprisingly, a large chunk of the
memory consumed is in "std::vectors", apparently allocated by
Epetra_FECrsMatrix::InputNonlocalValue().

The design of the code is such that there is an Epetra_FECrsMatrix
factory with a cache of one matrix (all of the matrices in my code
have the same graph). Upon calling the giveMeAMatrix() for this
factory, the cache gets (conditionally) recreated and is subsequently
copied out. It seems that this recalculation process eats more memory as
more and more matrices are computed.
I put up a memory profile for anyone to look at
<http://win.ua.ac.be/~nschloe/other/massif.out.3902>, e.g., using the
massif-visualizer (screen shot of my application view here
<http://win.ua.ac.be/~nschloe/other/massif-vis.png>).

Anyone with similar issues here, or an idea for a remedy?

Cheers,
Nico



------------------------------

Message: 2
Date: Tue, 3 Jul 2012 19:26:24 +0000
From: "Williams, Alan B" <william at sandia.gov>
Subject: Re: [Trilinos-Users] [EXTERNAL]  Epetra_FECrsMatrix memory
        issues?
To: "'nico.schloemer at gmail.com'" <nico.schloemer at gmail.com>,
        "'trilinos-users at software.sandia.gov'"
        <trilinos-users at software.sandia.gov>
Message-ID:
        <D25EE3DA6BA6E24F98F326A3E17796CA23853886 at EXMB01.srn.sandia.gov>
Content-Type: text/plain; charset="iso-8859-1"

Sounds like a leak. I'll look into it.
Alan


_______________________________________________
Trilinos-Users mailing list
Trilinos-Users at software.sandia.gov
http://software.sandia.gov/mailman/listinfo/trilinos-users



------------------------------

Message: 3
Date: Wed, 4 Jul 2012 00:27:12 +0200
From: Nico Schlömer <nico.schloemer at gmail.com>
Subject: Re: [Trilinos-Users] [EXTERNAL]  Epetra_FECrsMatrix memory
        issues?
To: "Williams, Alan B" <william at sandia.gov>
Cc: "trilinos-users at software.sandia.gov"
        <trilinos-users at software.sandia.gov>
Message-ID:
        <CAK6Z60fXwyDKDAFLAPySd+4xBsQnoy-EnaKtDCzo1AWLzUOmgg at mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1

Alright so Jeremie's leak fix in Epetra_CrsMatrix earlier this week
does *not* address the issue.
Looking at the output of Valgrind's massif, it seems that the culprit is

0x88769C: std::vector<double, std::allocator<double> >::_M_insert_aux(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double const&)
36.7 MB

in Epetra_FECrsMatrix::InputNonlocalValue. -- An iterator, really?

--Nico







------------------------------

Message: 4
Date: Wed, 4 Jul 2012 09:23:53 +0200
From: Nico Schlömer <nico.schloemer at gmail.com>
Subject: Re: [Trilinos-Users] [EXTERNAL]  Epetra_FECrsMatrix memory
        issues?
To: "Williams, Alan B" <william at sandia.gov>
Cc: "trilinos-users at software.sandia.gov"
        <trilinos-users at software.sandia.gov>
Message-ID:
        <CAK6Z60f8jqyj31=zXHV8wkOnnxRfxJvJBXqi6MLJ=sY1WbMOKQ at mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1

I think I now see what the issue is.
Epetra_FECrsMatrix::InputNonlocalValue() contains a number of
std::vector<double>::insert() statements working on class variables,
effectively extending the vectors every time this method is called.
This becomes an issue if the same matrix is filled a number of times
(as is the case with my application).
I'm not sure about why one sees the large memory allocation in the
std::vector<>::iterator after all, but this has been observed before
<https://savannah.cern.ch/bugs/?31968>.
It looks very much like this has been introduced in
72ed0256d81640b7374e5da028900daca9a86016 last February.
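
To illustrate the pattern described above, here is a minimal sketch. The class NonlocalBuffer, its members, and fill_repeatedly are hypothetical stand-ins and not the actual Epetra_FECrsMatrix code: a member vector that is only ever appended to keeps growing across fills unless it is reset once its contents have been consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a class that buffers off-process
// contributions in a member vector (not Epetra's real data layout).
class NonlocalBuffer {
public:
  // Appends one value; never removes anything by itself.
  void input(double value) { coefs_.insert(coefs_.end(), value); }

  // Intended to be called once the buffered values have been consumed:
  // swapping with a temporary drops the elements and releases storage.
  void reset() { std::vector<double>().swap(coefs_); }

  std::size_t size() const { return coefs_.size(); }

private:
  std::vector<double> coefs_;
};

// Fills the same buffer numFills times with valuesPerFill entries each
// and returns the number of entries buffered at the end.
inline std::size_t fill_repeatedly(NonlocalBuffer& buf, int numFills,
                                   int valuesPerFill, bool resetBetweenFills) {
  for (int f = 0; f < numFills; ++f) {
    if (resetBetweenFills) buf.reset();
    for (int i = 0; i < valuesPerFill; ++i) buf.input(1.0);
  }
  return buf.size();
}
```

With ten fills of 100 values each, the buffer ends up holding 1000 entries when it is never reset and 100 entries when it is reset before each fill, which mirrors the steady growth seen when one matrix is refilled many times.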

Additionally,
grep -r "insert(" ./packages/epetra/src/
shows that there are a number of other calls to insert in methods
which may be called a couple of times in a row, so those will probably
need to be checked.

--Nico








------------------------------

Message: 5
Date: Wed, 4 Jul 2012 10:12:55 +0200
From: Nico Schlömer <nico.schloemer at gmail.com>
Subject: [Trilinos-Users] moar memory leaks/obsessive allocations
To: trilinos-users at software.sandia.gov
Message-ID:
        <CAK6Z60d5C4JCXTmnd=tGZWWb2=ENTeEHRJKEB2N4v4p2ijyGBw at mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1

Hi all,

with the big chunk of the Epetra_FE*Matrix allocations gone, other
leak-like effects surface. Here's one with potentially larger impact:
Teuchos::reduceAll (which is called by virtually all dot-products).

http://win.ua.ac.be/~nschloe/other/mass2.png
http://win.ua.ac.be/~nschloe/other/massif.out.39989

Again, over time, the method builds up more and more memory. It's
less clear to me how this happens, though, as reduceAll() itself is
const. It appears to be the create_contiguous call, and hence the
numBytes argument to reduceAll. From there on, it's a matter of digging
into where those calls are made.

--Nico



------------------------------

Message: 6
Date: Wed, 4 Jul 2012 11:11:48 +0200
From: "Bart Janssens" <bart.janssens at lid.kviv.be>
Subject: Re: [Trilinos-Users] Epetra_FECrsMatrix memory issues?
To: trilinos-users at software.sandia.gov
Message-ID:
        <CAJoBg_WM4SL7TamRVnsytLVLSZcnCUNOQ3PcACUw8x68WJ_W9A at mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1

On Tue, Jul 3, 2012 at 9:07 PM, Nico Schlömer <nico.schloemer at gmail.com> wrote:
> I put up a memory profile for anyone to look at
> <http://win.ua.ac.be/~nschloe/other/massif.out.3902>, e.g., using the
> massif-visualizer (screen shot of my application view here
> <http://win.ua.ac.be/~nschloe/other/massif-vis.png>).
>
> Anyone with similar issues here, or an idea for a remedy?

Hi guys,

I'm seeing similar issues with Epetra_FEVbrMatrix: more and more
memory is used the longer the simulation runs. Memory profiling only
shows allocations from Trilinos, so I'm guessing this is the same
issue. As an additional hint, this only happens in parallel runs,
serial runs show no memory increase at all. We are running Trilinos
10.10.1.

Kind regards,

--
Bart




------------------------------

Message: 7
Date: Wed, 4 Jul 2012 08:55:03 -0400
From: "Bartlett, Roscoe A." <bartlettra at ornl.gov>
Subject: Re: [Trilinos-Users] moar memory leaks/obsessive allocations
To: 'Nico Schlömer' <nico.schloemer at gmail.com>,
        "'trilinos-users at software.sandia.gov'"
        <trilinos-users at software.sandia.gov>
Message-ID:
        <7D79831ADC03DE47A3626F74554229CF0E8BE241A4 at EXCHMB.ornl.gov>
Content-Type: text/plain; charset=utf-8

The reduceAll leak is fixed in the master branch and the latest release branch.

Sent from my Android phone.






------------------------------

Message: 8
Date: Wed, 4 Jul 2012 17:49:41 +0200
From: Nico Schlömer <nico.schloemer at gmail.com>
Subject: Re: [Trilinos-Users] moar memory leaks/obsessive allocations
To: "Bartlett, Roscoe A." <bartlettra at ornl.gov>
Cc: "trilinos-users at software.sandia.gov"
        <trilinos-users at software.sandia.gov>
Message-ID:
        <CAK6Z60dgVYWjxGAiew_XTO2sms_rRE4tBwjj5uM+cwrNkspFrg at mail.gmail.com>
Content-Type: text/plain; charset=iso-8859-1

Confirmed. Thanks!






------------------------------



End of Trilinos-Users Digest, Vol 83, Issue 1
*********************************************