[Trilinos-Users] Odd Behavior in EpetraExt_HDF5
Truman Ellis
truman at ices.utexas.edu
Mon Sep 15 16:56:49 MDT 2014
There isn't any distributed data in this example. I just wanted two MPI
processes to simultaneously write out two independent HDF5 files. But I
noticed that if the two HDF5 files hold different amounts of data (1000
items vs. 2000 items), the run stalls. If both processes write the same
amount of data, everything goes through.
On 9/15/14, 4:10 PM, Jonathan Hu wrote:
>> I am using the EpetraExt_HDF5 interface to save and load solutions, but
>> I've run into some odd behavior and was wondering if anyone could
>> explain it. My goal is to have each processor write out its own part of
>> the solution in a different HDF5 file. For the time being, I am assuming
>> that the number of processors loading the solution is equal to the
>> number writing it. Since each processor is completely independent, I
>> shouldn't get any weird race conditions or anything like that
>> (theoretically). In order to communicate this to EpetraExt, I am using an
>> Epetra_SerialComm in the constructor. However, the following code hangs
>> when I run with 2 MPI processes:
>>
>>
>> {
>>   int commRank = Teuchos::GlobalMPISession::getRank();
>>   Epetra_SerialComm Comm;
>>   EpetraExt::HDF5 hdf5(Comm);
>>   hdf5.Create("file"+Teuchos::toString(commRank)+".h5");
>>   vector<double> testVec;
>>   for (int i=0; i<1000+1000*commRank; i++)
>>   {
>>     testVec.push_back(1.0);
>>   }
>>   hdf5.Write("Test", "Group", H5T_NATIVE_DOUBLE, testVec.size(),
>>              &testVec[0]);
>> }
>> {
>>   int commRank = Teuchos::GlobalMPISession::getRank();
>>   Epetra_SerialComm Comm;
>>   EpetraExt::HDF5 hdf5(Comm);
>>   hdf5.Open("file"+Teuchos::toString(commRank)+".h5");
>>   hdf5.Close();
>> }
>>
>> Note that commRank 0 writes 1000 elements while commRank 1 writes 2000.
>> The code works just fine when both write the same number of elements.
>> Can someone enlighten me on what I am doing wrong? Is it possible to get
>> the behavior I want, where each processor's read and write is
>> independent of others?
>>
>> Thanks,
>> Truman Ellis
> Truman,
>
> Rank 1 is loading/writing testVec from 0..2000 due to the
> bounds in your for loop. I'm guessing that you want rank 1 to load
> from 1001..2000 instead, so replace
>
> for (int i=0; i<1000+1000*commRank; i++)
>
> with
>
> for (int i=1000*commRank; i<1000+1000*commRank; i++)
>
> Hope this helps.
>
> Jonathan