[Trilinos-Users] Odd Behavior in EpetraExt_HDF5

Truman Ellis truman at ices.utexas.edu
Mon Sep 15 16:56:49 MDT 2014


There isn't any distributed data in this example. I just wanted two MPI 
processes to simultaneously write two independent HDF5 files. But I 
noticed that if the two files hold different amounts of data (1000 items 
vs. 2000 items), I get a stall. If both write the same amount of data, 
everything goes through.
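
To make the sizes concrete, here is a minimal standalone sketch that only counts the items each rank would push with the original loop bounds and with the bounds suggested in the quoted reply below. It uses no MPI or Trilinos, the helper names fillOriginal and fillSuggested are hypothetical, and it says nothing about why the write stalls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: same bounds as the original loop,
// i in [0, 1000 + 1000*rank).
std::vector<double> fillOriginal(int rank) {
    std::vector<double> v;
    for (int i = 0; i < 1000 + 1000 * rank; i++) {
        v.push_back(1.0);
    }
    return v;
}

// Hypothetical helper: same bounds as the suggested loop,
// i in [1000*rank, 1000 + 1000*rank).
std::vector<double> fillSuggested(int rank) {
    std::vector<double> v;
    for (int i = 1000 * rank; i < 1000 + 1000 * rank; i++) {
        v.push_back(1.0);
    }
    return v;
}
```

With the original bounds, rank 0 holds 1000 items and rank 1 holds 2000. With the suggested bounds, both ranks hold 1000 items, which is the equal-size case that already goes through, so it does not exercise the different-size case described above.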

On 9/15/14, 4:10 PM, Jonathan Hu wrote:
>> I am using the EpetraExt_HDF5 interface to save and load solutions, but
>> I've run into some odd behavior and was wondering if anyone could
>> explain it. My goal is to have each processor write out its own part of
>> the solution in a different HDF5 file. For the time being, I am assuming
>> that the number of processors loading the solution is equal to the
>> number writing it. Since each processor is completely independent, I
>> shouldn't get any weird race conditions or anything like that
>> (theoretically). In order to communicate this to EpetraExt, I am using an
>> Epetra_SerialComm in the constructor. However, the following code hangs
>> when I run with 2 MPI processes:
>>
>>
>> {
>>     int commRank = Teuchos::GlobalMPISession::getRank();
>>     Epetra_SerialComm Comm;
>>     EpetraExt::HDF5 hdf5(Comm);
>>     hdf5.Create("file"+Teuchos::toString(commRank)+".h5");
>>     vector<double> testVec;
>>     for (int i=0; i<1000+1000*commRank; i++)
>>     {
>>       testVec.push_back(1.0);
>>     }
>>     hdf5.Write("Test", "Group", H5T_NATIVE_DOUBLE, testVec.size(),
>>                &testVec[0]);
>> }
>> {
>>     int commRank = Teuchos::GlobalMPISession::getRank();
>>     Epetra_SerialComm Comm;
>>     EpetraExt::HDF5 hdf5(Comm);
>>     hdf5.Open("file"+Teuchos::toString(commRank)+".h5");
>>     hdf5.Close();
>> }
>>
>> Note that commRank 0 writes 1000 elements while commRank 1 writes 2000.
>> The code works just fine when both write the same number of elements.
>> Can someone enlighten me on what I am doing wrong? Is it possible to get
>> the behavior I want, where each processor's read and write is
>> independent of others?
>>
>> Thanks,
>> Truman Ellis
> Truman,
>
>     Rank 1 is loading/writing testVec from 0..1999 due to the 
> bounds in your for loop.  I'm guessing that you want rank 1 to load 
> from 1000..1999 instead, so replace
>
>    for (int i=0; i<1000+1000*commRank; i++)
>
> with
>
>    for (int i=1000*commRank; i<1000+1000*commRank; i++)
>
> Hope this helps.
>
> Jonathan
> _______________________________________________
> Trilinos-Users mailing list
> Trilinos-Users at software.sandia.gov
> https://software.sandia.gov/mailman/listinfo/trilinos-users
>


