[Trilinos-Users] Sending the resulting eigenvector to master node

Mehmet Salih YILDIRIM linux at isadamlari.org
Fri Jun 4 06:19:20 MDT 2010


Hello,

I tried to debug the program with gdb (I don't know how useful it would be
since the program is supposed to be run with mpirun normally) and i got the
following results when i backtrace:

(gdb) backtrace

#0  0x0000003f7b552c38 in main_arena () from /lib64/libc.so.6

#1  0x00000000004c8d8f in Epetra_MultiVector::~Epetra_MultiVector ()

#2  0x0000000000481765 in Teuchos::RCPNodeHandle::unbindOne ()

#3  0x0000000000432262 in ~BasicEigenproblem (this=0x18c0aeb0) at
> /usr/local/include/Teuchos_RCPNode.hpp:907

#4  0x0000000000481765 in Teuchos::RCPNodeHandle::unbindOne ()

#5  0x000000000040ee8e in DistNCut::Image::doSegmentate (this=0x18bef670,
> segment=0x7fff6ef8dfd0, output=0x18bef6a0)

    at /usr/local/include/Teuchos_RCPNode.hpp:907

#6  0x000000000040e898 in DistNCut::Image::doSegmentate (this=0x18bef670,
> segment=0x7fff6ef8e700, output=0x18bef6a0) at DistNCut.cpp:436

#7  0x000000000040fc38 in DistNCut::Image::segmentate (this=0x18bef670) at
> DistNCut.cpp:223

#8  0x0000000000474806 in main (argc=1, argv=0x7fff6ef8e8a8) at main.cpp:33

(gdb)


I don't know why the RCPNodeHandle::unbindOne() is invoked (and it calls the
destructor of BasicEigenproblem surprisingly, even though BasicEigenproblem
was not pointed by some RCP object.) from my function Image::doSegmentate
since i didn't call it in my code.

Any ideas about why the segmentation fault happens after seeing this
backtrace?
Any help is appreciated.

Best regards.
Mehmet Salih YILDIRIM

On Thu, Jun 3, 2010 at 8:07 PM, Mehmet Salih YILDIRIM
<linux at isadamlari.org>wrote:

> Hi,
>
> Well, I commented out the Multiply method, however nothing changed.
>
> I was unable to solve the eigenproblem before, so i decided to send a
> random vector to master node and go on, so, when i produce a random vector,
> everything works ok, but the problem is only when i try to send the
> resulting eigenvector, i'm sending the code snippet again, with the
> producing random vector part, so maybe it would be more clear:
>
>     int numMyElements;
>
>
>>     if (Comm.MyPID() == 0)
>
>         numMyElements = segment->pixelIndices.size();
>
>     else
>
>         numMyElements = 0;
>
>
>>     Epetra_Map targetMap(-1, numMyElements, 0, Comm);
>
>
>>     Epetra_Vector eigenVector(targetMap); //will be collected on the
>> master node.
>
>
>>     Teuchos::RCP<Epetra_Vector> vectorToSend;
>
>
>>     const Epetra_BlockMap *sourceMap;
>
>     if (solverReturn == Anasazi::Converged) {
>
>         Anasazi::Eigensolution<ST, MV> solution =
>> generalizedProblem->getSolution();
>
>
>>         int numEigenPairs = solution.numVecs;
>
>
>>         Teuchos::RCP<MV> evecs = solution.Evecs;
>
>         std::vector<Anasazi::Value<ST> > evals = solution.Evals;
>
>
>>         int ourEvIdx = 0;
>
>         if (numEigenPairs == 2) //send the second smaller eigenvector
>
>             if (evals[0].realpart < evals[1].realpart)
>
>                 ourEvIdx = 1;
>
>             else
>
>                 ourEvIdx = 0;
>
>
>>         //evecs->Multiply(1,*invSqrtD, *evecs, 0.0); commented this one
>> out, nothing changed.
>
>         vectorToSend = Teuchos::rcp((*evecs)(ourEvIdx));
>
>         sourceMap = &vectorToSend->Map();
>
>     } else {
>
>         vectorToSend = Teuchos::rcp(new Epetra_Vector(DminusW->RowMap()));
>
>         vectorToSend->Random();
>
>         sourceMap = &vectorToSend->Map();
>
>     Epetra_Import importer(targetMap, *sourceMap);
>
>     eigenVector.Import(*vectorToSend, importer, Add); // send eigenvector
>> to processor 1
>
>
>
> Well, the error message is, as follows again:
>
> *** glibc detected *** ./opencv_deneme: corrupted double-linked list:
>> 0x000000001a254ec0 ***
>
> *** glibc detected *** ./opencv_deneme: corrupted double-linked list:
>> 0x000000001d68fa00 ***
>
> ======= Backtrace: =========
>
> /lib64/libc.so.6[0x3f7b2723e5]
>
> /lib64/libc.so.6(cfree+0x4b)[0x3f7b27273b]
>
> ./opencv_deneme[0x4c930f]
>
> ./opencv_deneme[0x481ce5]
>
> ./opencv_deneme[0x4358b2]
>
> ./opencv_deneme[0x481ce5]
>
> ./opencv_deneme[0x40f139]
>
> ./opencv_deneme[0x40eacf]
>
> ./opencv_deneme[0x40eacf]
>
> ./opencv_deneme[0x4102f4]
>
> ./opencv_deneme[0x474d86]
>
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x3f7b21d994]
>
> ./opencv_deneme(__gxx_personality_v0+0x299)[0x40ba89]
>
> ======= Memory map: ========
>
> 00400000-0051e000 r-xp 00000000 03:01 9864086
>>  /home/proje/trilinos_ile_ilk_deneme/opencv
>> deneme/dist/Debug/GNU-Linux-x86/opencv_deneme
>
> 0071d000-00724000 rw-p 0011d000 03:01 9864086
>>  /home/proje/trilinos_ile_ilk_deneme/opencv
>> deneme/dist/Debug/GNU-Linux-x86/opencv_deneme
>
> 1d61a000-1d722000 rw-p 1d61a000 00:00 0
>>  [heap]
>
> 3a68600000-3a689f6000 r-xp 00000000 03:01 6687839
>>  /usr/lib64/atlas/liblapack.so.3.0
>
> 3a689f6000-3a68bf6000 ---p 003f6000 03:01 6687839
>>  /usr/lib64/atlas/liblapack.so.3.0
>
> 3a68bf6000-3a68bf9000 rw-p 003f6000 03:01 6687839
>>  /usr/lib64/atlas/liblapack.so.3.0
>
> 3a68bf9000-3a68cfd000 rw-p 3a68bf9000 00:00 0
>
> 3a69600000-3a69cec000 r-xp 00000000 03:01 6687846
>>  /usr/lib64/atlas/libatlas.so.3.0
>
> 3a69cec000-3a69eeb000 ---p 006ec000 03:01 6687846
>>  /usr/lib64/atlas/libatlas.so.3.0
>
> 3a69eeb000-3a69ef5000 rw-p 006eb000 03:01 6687846
>>  /usr/lib64/atlas/libatlas.so.3.0
>
> 3a6c200000-3a6c254000 r-xp 00000000 03:01 5380294
>>  /usr/lib64/libblas.so.3.0.3
>
> 3a6c254000-3a6c453000 ---p 00054000 03:01 5380294
>>  /usr/lib64/libblas.so.3.0.3
>
> 3a6c453000-3a6c454000 rw-p 00053000 03:01 5380294
>>  /usr/lib64/libblas.so.3.0.3
>
> 3a6ca00000-3a6ca1e000 r-xp 00000000 03:01 6687833
>>  /usr/lib64/atlas/libcblas.so.3.0
>
> 3a6ca1e000-3a6cc1e000 ---p 0001e000 03:01 6687833
>>  /usr/lib64/atlas/libcblas.so.3.0
>
> 3a6cc1e000-3a6cc1f000 rw-p 0001e000 03:01 6687833
>>  /usr/lib64/atlas/libcblas.so.3.0
>
> 3a6ce00000-3a6ce1c000 r-xp 00000000 03:01 6687835
>>  /usr/lib64/atlas/libf77blas.so.3.0
>
> 3a6ce1c000-3a6d01c000 ---p 0001c000 03:01 6687835
>>  /usr/lib64/atlas/libf77blas.so.3.0
>
>
> Any help still would be appreciated :)
>
> Best regards,
> Mehmet Salih YILDIRIM
>
> On Thu, Jun 3, 2010 at 1:58 AM, Thornquist, Heidi K <hkthorn at sandia.gov>wrote:
>
>>  Hi Mehmet,
>>
>> If you are getting a seg fault that traces back to LAPACK, then that would
>> incriminate the “Multiply” method.  If you comment out
>> that line, does that help?  Also, I’m not sure how safe it is to call
>> multiply on the same multivector that you are including in the arguments and
>> also
>> asking to be zeroed out.
>>
>> Thanks,
>> Heidi
>>
>>
>>
>>
>> On 6/2/10 2:02 PM, "Mehmet Salih YILDIRIM" <linux at isadamlari.org> wrote:
>>
>> Hey!
>>
>> I'd been trying to send the resulting eigenvector to the master node (that
>> is, with MyPID() =0) and i wrote the following code:
>>
>>  int numMyElements;
>>
>>     if (Comm.MyPID() == 0)
>>         numMyElements = segment->pixelIndices.size(); //that is, the total
>> number of elements of an eigenvector (also the row count of operator of
>> eigenproblem)
>>     else
>>         numMyElements = 0;
>>
>>     Epetra_Map targetMap(-1, numMyElements, 0, Comm);
>>
>>     Epetra_Vector eigenVector(targetMap); //will be collected on the
>> master node.
>>
>>     Teuchos::RCP<Epetra_Vector> vectorToSend;
>>
>>     const Epetra_BlockMap *sourceMap;
>>     if (solverReturn == Anasazi::Converged) {
>>         Anasazi::Eigensolution<ST, MV> solution =
>> generalizedProblem->getSolution();
>>
>>         int numEigenPairs = solution.numVecs;
>>
>>         Teuchos::RCP<MV> evecs = solution.Evecs;
>>         std::vector<Anasazi::Value<ST> > evals = solution.Evals;
>>
>>         int ourEvIdx = 0;
>>         if (numEigenPairs == 2)
>>             if (evals[0].realpart < evals[1].realpart)
>>                 ourEvIdx = 1;
>>         evecs->Multiply(1,*invSqrtD, *evecs, 0.0);
>>         vectorToSend = Teuchos::rcp((*evecs)(ourEvIdx));
>>         sourceMap = &vectorToSend->Map();
>>     }
>>
>>     Epetra_Import importer(targetMap, *sourceMap);
>>
>>     eigenVector.Import(*vectorToSend, importer, Add); // send eigenvector
>> to processor 1
>>
>>
>> However i get a fault like : (something like segmentation fault i think)
>>
>> *** glibc detected *** ./opencv_deneme: corrupted double-linked list:
>> 0x000000002006cec0 ***
>> ======= Backtrace: =========
>> /lib64/libc.so.6[0x3f7b2723e5]
>> /lib64/libc.so.6(cfree+0x4b)[0x3f7b27273b]
>> ./opencv_deneme[0x4c976f]
>> ./opencv_deneme[0x482145]
>> ./opencv_deneme[0x42a292]
>> ./opencv_deneme[0x482145]
>> ./opencv_deneme[0x40f0be]
>> ./opencv_deneme[0x40ea52]
>> ./opencv_deneme[0x40ea52]
>> ./opencv_deneme[0x410294]
>> ./opencv_deneme[0x4751e6]
>> /lib64/libc.so.6(__libc_start_main+0xf4)[0x3f7b21d994]
>> ./opencv_deneme(__gxx_personality_v0+0x299)[0x40b9d9]
>> ======= Memory map: ========
>> 00400000-0051e000 r-xp 00000000 03:01 9864086
>>  /home/proje/trilinos_ile_ilk_deneme/opencv
>> deneme/dist/Debug/GNU-Linux-x86/opencv_deneme
>> 0071e000-00725000 rw-p 0011e000 03:01 9864086
>>  /home/proje/trilinos_ile_ilk_deneme/opencv
>> deneme/dist/Debug/GNU-Linux-x86/opencv_deneme
>> 1fff8000-20100000 rw-p 1fff8000 00:00 0
>>  [heap]
>> 3a68600000-3a689f6000 r-xp 00000000 03:01 6687839
>>  /usr/lib64/atlas/liblapack.so.3.0
>> 3a689f6000-3a68bf6000 ---p 003f6000 03:01 6687839
>>  /usr/lib64/atlas/liblapack.so.3.0
>> 3a68bf6000-3a68bf9000 rw-p 003f6000 03:01 6687839
>>  /usr/lib64/atlas/liblapack.so.3.0
>> 3a68bf9000-3a68cfd000 rw-p 3a68bf9000 00:00 0
>> 3a69600000-3a69cec000 r-xp 00000000 03:01 6687846
>>  /usr/lib64/atlas/libatlas.so.3.0
>> 3a69cec000-3a69eeb000 ---p 006ec000 03:01 6687846
>>  /usr/lib64/atlas/libatlas.so.3.0
>> 3a69eeb000-3a69ef5000 rw-p 006eb000 03:01 6687846
>>  /usr/lib64/atlas/libatlas.so.3.0
>> 3a6c200000-3a6c254000 r-xp 00000000 03:01 5380294
>>  /usr/lib64/libblas.so.3.0.3
>> 3a6c254000-3a6c453000 ---p 00054000 03:01 5380294
>>  /usr/lib64/libblas.so.3.0.3
>> 3a6c453000-3a6c454000 rw-p 00053000 03:01 5380294
>>  /usr/lib64/libblas.so.3.0.3
>> 3a6ca00000-3a6ca1e000 r-xp 00000000 03:01 6687833
>>  /usr/lib64/atlas/libcblas.so.3.0
>>
>> and so on.
>>
>>
>>
>> So how may i solve this problem? I couldn't figure out where I am
>> mistaken.
>>
>> Any help is appreciated,
>> Mehmet Salih YILDIRIM
>>
>>
>>
>>
>
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at software.sandia.gov
> http://software.sandia.gov/mailman/listinfo/trilinos-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://software.sandia.gov/pipermail/trilinos-users/attachments/20100604/b2bb9772/attachment-0001.html 


More information about the Trilinos-Users mailing list