[Trilinos-Users] Odd Behavior in EpetraExt_HDF5

Jonathan Hu jhu at sandia.gov
Tue Sep 16 12:08:49 MDT 2014


trilinos-users-request at software.sandia.gov wrote on 09/16/2014 11:00 AM:
> Subject:
> Re: [Trilinos-Users] Odd Behavior in EpetraExt_HDF5
> From:
> Truman Ellis <truman at ices.utexas.edu>
> Date:
> 09/15/2014 03:56 PM
>
> To:
> <trilinos-users at software.sandia.gov>
>
>
> There isn't any distributed data in this example. I just wanted two 
> MPI processes to simultaneously write out two independent HDF5 files. 
> But I noticed that if the two HDF5 files were different sizes (1000 
> data items vs 2000 data items) then I got a stall. If they both write 
> data of the same size, then everything goes through.
>
> On 9/15/14, 4:10 PM, Jonathan Hu wrote:
>>> I am using the EpetraExt_HDF5 interface to save and load solutions, but
>>> I've run into some odd behavior and was wondering if anyone could
>>> explain it. My goal is to have each processor write out its own part of
>>> the solution in a different HDF5 file. For the time being, I am 
>>> assuming
>>> that the number of processors loading the solution is equal to the
>>> number writing it. Since each processor is completely independent, I
>>> shouldn't get any weird race conditions or anything like that
>>> (theoretically). In order to communicate this to EpetraExt, I am
>>> using an Epetra_SerialComm in the constructor. However, the following
>>> code hangs when I run with 2 MPI nodes:
>>>
>>>
>>> {
>>>     int commRank = Teuchos::GlobalMPISession::getRank();
>>>     Epetra_SerialComm Comm;
>>>     EpetraExt::HDF5 hdf5(Comm);
>>>     hdf5.Create("file"+Teuchos::toString(commRank)+".h5");
>>>     vector<double> testVec;
>>>     for (int i=0; i<1000+1000*commRank; i++)
>>>     {
>>>       testVec.push_back(1.0);
>>>     }
>>>     hdf5.Write("Test", "Group", H5T_NATIVE_DOUBLE, testVec.size(), &testVec[0]);
>>> }
>>> {
>>>     int commRank = Teuchos::GlobalMPISession::getRank();
>>>     Epetra_SerialComm Comm;
>>>     EpetraExt::HDF5 hdf5(Comm);
>>>     hdf5.Open("file"+Teuchos::toString(commRank)+".h5");
>>>     hdf5.Close();
>>> }
>>>
>>> Note that commRank 0 writes 1000 elements while commRank 1 writes 2000.
>>> The code works just fine when both write the same number of elements.
>>> Can someone enlighten me on what I am doing wrong? Is it possible to 
>>> get
>>> the behavior I want, where each processor's read and write is
>>> independent of others?
>>>
>>> Thanks,
>>> Truman Ellis
>> Truman,
>>
>>     Rank 1 is loading/writing testVec from 0..2000 due to the 
>> bounds in your for loop.  I'm guessing that you want rank 1 to load 
>> from 1001..2000 instead, so replace
>>
>>    for (int i=0; i<1000+1000*commRank; i++)
>>
>> with
>>
>>    for (int i=1000*commRank; i<1000+1000*commRank; i++)
>>
>> Hope this helps.
>>
>> Jonathan
>>
Truman,

   Ok, I completely misunderstood your original email.  Hopefully one of 
the I/O developers can chime in here.

Jonathan
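
For reference, the quoted snippets assembled into one self-contained program.
This is only a sketch: the headers, main(), and GlobalMPISession scaffolding
are assumed here and were not part of the original post, and the build is
assumed to have EpetraExt HDF5 support enabled. The file names, group/dataset
labels, and sizes (rank 0 writes 1000 doubles, rank 1 writes 2000) are exactly
those posted above.

// Restatement of the quoted code; each rank writes and reopens its own file.
#include <vector>
#include <hdf5.h>
#include "Teuchos_GlobalMPISession.hpp"
#include "Teuchos_toString.hpp"
#include "Epetra_SerialComm.h"
#include "EpetraExt_HDF5.h"

int main(int argc, char* argv[])
{
  Teuchos::GlobalMPISession mpiSession(&argc, &argv);
  int commRank = Teuchos::GlobalMPISession::getRank();

  {
    // Each rank writes its own, independently sized file through a serial comm.
    Epetra_SerialComm Comm;
    EpetraExt::HDF5 hdf5(Comm);
    hdf5.Create("file" + Teuchos::toString(commRank) + ".h5");
    std::vector<double> testVec;
    for (int i = 0; i < 1000 + 1000*commRank; ++i)
      testVec.push_back(1.0);
    hdf5.Write("Test", "Group", H5T_NATIVE_DOUBLE, testVec.size(), &testVec[0]);
    // As in the original post, the write handle is not explicitly closed here.
  }
  {
    // Each rank reopens only the file it wrote.
    Epetra_SerialComm Comm;
    EpetraExt::HDF5 hdf5(Comm);
    hdf5.Open("file" + Teuchos::toString(commRank) + ".h5");
    hdf5.Close();
  }
  return 0;
}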


