[Trilinos-Users] [EXTERNAL] Muelu issue with exported Tpetra CrsMatrix

C.S. Natarajan csnataraj at gmail.com
Sun Aug 17 17:47:28 MDT 2014


I thought ParMETIS fell into the coupled aggregation bucket; however, I just
read the SC2000 paper and I think I now understand what coupled
aggregation means in the context of ML/MueLu. Thanks for the help and
clarification.


On Sat, Aug 16, 2014 at 2:06 PM, Andrey Prokopenko <aprokop at sandia.gov>
wrote:

>> Any ideas on the timeline coupled aggregation might be available in muelu?
>
> At this point, I'm not sure. It will probably depend on how many users
> request it. The issue is that it would take a lot of work and time to make
> its performance good on larger core counts.
>
> However, coupled aggregation is different from METIS- or ParMETIS-based
> aggregation. The latter could be added much sooner.
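>
> For reference, a minimal sketch of the ML-side settings being discussed,
> assuming the usual ml_MultiLevelPreconditioner interface (epetraA is a
> placeholder for an assembled Epetra operator):
>
>     #include "Epetra_RowMatrix.h"
>     #include "Teuchos_ParameterList.hpp"
>     #include "ml_MultiLevelPreconditioner.h"
>
>     ML_Epetra::MultiLevelPreconditioner*
>     buildMLPrec(const Epetra_RowMatrix& epetraA)
>     {
>       Teuchos::ParameterList MLList;
>       ML_Epetra::SetDefaults("SA", MLList);  // smoothed-aggregation defaults
>       // METIS/ParMETIS-based aggregation as it exists in ML today;
>       // "Uncoupled" is the analogue of MueLu's current default.
>       MLList.set("aggregation: type", "ParMETIS");
>       MLList.set("aggregation: nodes per aggregate", 27);
>       return new ML_Epetra::MultiLevelPreconditioner(epetraA, MLList);
>     }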
>
> -Andrey
>
>
> On Fri 15 Aug 2014 05:21:21 PM MDT, C.S. Natarajan wrote:
>
>> That did the trick, thank you very much! I have some stuff in ML that
>> I would like to port over to MueLu at some point, and in ML I do use
>> METIS and ParMETIS (only at significantly coarser levels), so it might
>> be nice to have coupled aggregation. Any idea of the timeline for when
>> coupled aggregation might be available in MueLu?
>>
>> Thanks again,
>> C.S.N
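>>
>> For the ML-to-MueLu port mentioned above, a minimal sketch, assuming the
>> MueLu::MLParameterListInterpreter header that the example further down
>> already includes (mlParams and mueluA stand in for an existing ML-style
>> parameter list and an Xpetra-wrapped matrix):
>>
>>     #include <MueLu_MLParameterListInterpreter.hpp>
>>
>>     // Translate an ML-style parameter list into a MueLu hierarchy.
>>     MueLu::MLParameterListInterpreter<SC, LO, LLT, NO, LMO>
>>       mueLuFactory(mlParams);
>>     Teuchos::RCP<MueLu::Hierarchy<SC, LO, LLT, NO, LMO> > H =
>>       mueLuFactory.CreateHierarchy();
>>     H->GetLevel(0)->Set("A", mueluA);  // the fine-level Xpetra::Matrix
>>     mueLuFactory.SetupHierarchy(*H);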
>>
>>
>> On Fri, Aug 15, 2014 at 11:42 AM, Andrey Prokopenko
>> <aprokop at sandia.gov> wrote:
>>
>>     Thank you for bringing this issue to our attention. I think I
>>     found the solution.
>>
>>     Some background: MueLu has two aggregation routines, uncoupled and
>>     coupled. Coupled aggregation may be more robust in the long term
>>     than uncoupled; however, at the moment it is not well optimized, is
>>     not well exercised, and should not be used. Looking at your logs,
>>     we realized that there is still a place in MueLu that defaults to
>>     the coupled case.
>>
>>     I pushed a commit fixing that (I also attached the patch, as it
>>     may take some time to propagate to the public Trilinos repo).
>>     Please try it out and let me know.
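>>
>>     Until the fix reaches the public repo, a minimal sketch of a
>>     workaround that pins the aggregation choice explicitly, assuming
>>     the standard FactoryManager interface (mueluA is the Xpetra-wrapped
>>     matrix from your example):
>>
>>         #include "MueLu_FactoryManager.hpp"
>>         #include "MueLu_UncoupledAggregationFactory.hpp"
>>
>>         MueLu::FactoryManager<SC, LO, LLT, NO, LMO> M;
>>         // Request uncoupled aggregation explicitly so nothing falls
>>         // back to the coupled code path.
>>         M.SetFactory("Aggregates", Teuchos::rcp(
>>             new MueLu::UncoupledAggregationFactory<LO, LLT, NO, LMO>()));
>>
>>         MueLu::Hierarchy<SC, LO, LLT, NO, LMO> H(mueluA);
>>         H.Setup(M, 0, 4);  // start at level 0, build at most 4 levels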
>>
>>     -Andrey
>>
>>
>>     On Thu 14 Aug 2014 03:20:40 PM MDT, C.S. Natarajan wrote:
>>
>>
>>         Hi Andrey, thanks for taking a look. Yeah, 512^3 was the first
>>         wall I hit (256^3 seemed to work fine), and I didn't run it
>>         through valgrind. The output just before the corruption and the
>>         trace are below (compiled with cxx flags -O0 -g).
>>
>>         Cheers,
>>         C.S.N
>>
>>
>>         Level 3
>>         Prolongator smoothing (MueLu::SaPFactory)
>>         Build (MueLu::TentativePFactory)
>>         Build (MueLu::CoupledAggregationFactory)
>>         Build (MueLu::CoalesceDropFactory)
>>         lightweight wrap = 0
>>         CoalesceDropFactory::Build(): found blockdim=1 from strided
>>         maps. offset=0
>>         Build (MueLu::AmalgamationFactory)
>>         AmalagamationFactory::Build(): found fullblocksize=1 and
>>         stridedblocksize=1 from strided maps. offset=0
>>         CoalesceDropFactory::SetupAmalgamationData() # of amalgamated
>>         blocks=421729
>>         CoalesceDropFactory: nodeMap 3364/421729 elements
>>         Detected 0 Dirichlet nodes
>>         Coarsen Uncoupled (MueLu::LocalAggregationAlgorithm)
>>         Ordering: Natural
>>         Min nodes per aggregate: 1
>>         Max nbrs already selected: 0
>>         Nodes aggregated = 390513 (421729)
>>         Total aggregates = 35554
>>         AggregateLeftovers (MueLu::LeftoverAggregationAlgorithm)
>>         Phase 1 - nodes aggregated = 420475
>>         Phase 1 - total aggregates = 35554
>>         Phase 3 - additional aggregates = 0
>>         ======= Backtrace: =========
>>         /lib64/libc.so.6(+0x75358)[0x2b5438a4a358]
>>         /lib64/libc.so.6(+0x7899d)[0x2b5438a4d99d]
>>         /lib64/libc.so.6(__libc_malloc+0x77)[0x2b5438a4f3e7]
>>         /usr/lib64/libstdc++.so.6(_Znwm+0x1d)[0x2b543857259d]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZN6Xpetra8toXpetraIixN13KokkosClassic10SerialNodeEEEKN7Teuchos3RCPIKNS_3MapIT_T0_T1_EEEERKNS4_IKN6Tpetra3MapIS6_S7_S8_EEEE+0x42)[0xd5097a]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK6Xpetra17TpetraMultiVectorIdixN13KokkosClassic10SerialNodeEE6getMapEv+0x65)[0xb748c9]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu28CoupledAggregationCommHelperIixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE23ArbitrateAndCommunicateERN6Xpetra6VectorIdixS2_EERNSA_IiixS2_EEPSD_b+0xd7)[0xb46863]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu28CoupledAggregationCommHelperIixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE23ArbitrateAndCommunicateERN6Xpetra6VectorIdixS2_EERNS_10AggregatesIixS2_S7_EEb+0x89)[0xb46785]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu28LeftoverAggregationAlgorithmIixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE18AggregateLeftoversERKNS_9GraphBaseIixS2_S7_EERNS_10AggregatesIixS2_S7_EE+0x5e1c)[0xb42d80]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu25CoupledAggregationFactoryIixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE5BuildERNS_5LevelE+0x60c)[0xb459ac]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu22SingleLevelFactoryBase9CallBuildERNS_5LevelE+0x39)[0xad08e9]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZN5MueLu5Level3GetIN7Teuchos3RCPINS_10AggregatesIixN13KokkosClassic10SerialNodeENS5_12AltSparseOpsIviS6_NS5_7details28AltSparseOpsDefaultAllocatorIiS6_EEEEEEEEEERT_RKSsPKNS_11FactoryBaseE+0xbb6)[0xac201e]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu7Factory3GetIN7Teuchos3RCPINS_10AggregatesIixN13KokkosClassic10SerialNodeENS5_12AltSparseOpsIviS6_NS5_7details28AltSparseOpsDefaultAllocatorIiS6_EEEEEEEEEET_RNS_5LevelERKSs+0x79)[0xacede5]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu17TentativePFactoryIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE6BuildPERNS_5LevelESA_+0x28f)[0xb01feb]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu17TentativePFactoryIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE5BuildERNS_5LevelESA_+0x45)[0xb01d55]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu19TwoLevelFactoryBase9CallBuildERNS_5LevelE+0x3fa)[0xad5dd6]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZN5MueLu5Level3GetIN7Teuchos3RCPIN6Xpetra6MatrixIdixN13KokkosClassic10SerialNodeENS6_12AltSparseOpsIviS7_NS6_7details28AltSparseOpsDefaultAllocatorIiS7_EEEEEEEEEERT_RKSsPKNS_11FactoryBaseE+0xbb6)[0xabdf52]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu10SaPFactoryIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE6BuildPERNS_5LevelESA_+0x5af)[0xb101ab]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu10SaPFactoryIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE5BuildERNS_5LevelESA_+0x45)[0xb0fbf5]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu19TwoLevelFactoryBase9CallBuildERNS_5LevelE+0x3fa)[0xad5dd6]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZN5MueLu5Level3GetIN7Teuchos3RCPIN6Xpetra6MatrixIdixN13KokkosClassic10SerialNodeENS6_12AltSparseOpsIviS7_NS6_7details28AltSparseOpsDefaultAllocatorIiS7_EEEEEEEEEERT_RKSsPKNS_11FactoryBaseE+0xbb6)[0xabdf52]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZNK5MueLu13TopRAPFactoryIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE5BuildERNS_5LevelESA_+0xd5)[0xaf0879]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZN5MueLu9HierarchyIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE5SetupEiN7Teuchos3PtrIKNS_18FactoryManagerBaseEEESD_SD_+0x1cc5)[0xada501]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(_ZN5MueLu9HierarchyIdixN13KokkosClassic10SerialNodeENS1_12AltSparseOpsIviS2_NS1_7details28AltSparseOpsDefaultAllocatorIiS2_EEEEE5SetupERKNS_18FactoryManagerBaseEii+0x111b)[0xadbf2b]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d(main+0x49ee)[0x71436c]
>>         /lib64/libc.so.6(__libc_start_main+0xe6)[0x2b54389f3c36]
>>         /gpfs07/vol0/BCF/RD/users/natac0/intrepid/bin/fem_3d[0x70b189]
>>
>>
>>
>>         On Thu, Aug 14, 2014 at 3:25 PM, Andrey Prokopenko
>>         <aprokop at sandia.gov> wrote:
>>
>>         Hi Natarajan,
>>
>>         I'm going to take a look at it. Is 512^3 the smallest reproducible
>>         size? Have you tried running a smaller problem through valgrind?
>>
>>         Could you please also attach the output that appears just before
>>         the corruption? That would help us identify the responsible
>>         factory. It would also be great if you could include the stack
>>         trace.
>>
>>         -Andrey
>>
>>
>>         On Thu 14 Aug 2014 12:00:11 PM MDT, C.S. Natarajan wrote:
>>
>>         Hi,
>>         I am running into a strange issue when building the MG hierarchy
>>         with MueLu. This only happens for large problems. I am pasting a
>>         minimal example that can reproduce the error. The example
>>         assembles a Laplace operator on a structured grid with Q1
>>         elements, with Dirichlet boundary conditions in the Z direction
>>         and Neumann BCs in the X and Y directions. In the example there
>>         are two global node maps, gmap and gmap2. gmap is used to build
>>         matrix A, which has an unequal distribution of nodes across
>>         processors. gmap2 is used to export matrix A from gmap to matrix
>>         B, which contains approximately the same number of nodes on each
>>         processor. For a problem of size 512^3, while A builds fine,
>>         building the hierarchy with matrix B fails with a memory
>>         corruption error that ends in an MPI_Reduce_scatter error. I am
>>         using the public git version as of 08/11. The build is with
>>         Intel C++ 14.0.0 and support for long long int is enabled (do I
>>         also need to enable ILP64 from MKL?).
>>
>>         I would really appreciate any input on what I might be doing
>>         wrong or whether there is a bug in the MueLu library.
>>
>>         TIA,
>>         C.S.N
>>
>>     #include <sstream>
>>
>>     // Epetra includes
>>     #include "Epetra_BlockMap.h"
>>     #include "Epetra_Comm.h"
>>     #include "Epetra_MpiComm.h"
>>     #include "Epetra_VbrMatrix.h"
>>     #include "Epetra_SerialDenseMatrix.h"
>>     #include "Epetra_FEVector.h"
>>     #include "Epetra_FEVbrMatrix.h"
>>
>>     // Intrepid includes
>>     #include "Intrepid_ArrayTools.hpp"
>>     #include "Intrepid_FunctionSpaceTools.hpp"
>>     #include "Intrepid_FieldContainer.hpp"
>>     #include "Intrepid_CellTools.hpp"
>>     #include "Intrepid_HGRAD_HEX_C1_FEM.hpp"
>>     #include "Intrepid_RealSpaceTools.hpp"
>>     #include "Intrepid_DefaultCubatureFactory.hpp"
>>     #include "Intrepid_Utils.hpp"
>>     #include <Intrepid_HGRAD_QUAD_C1_FEM.hpp>
>>
>>     // Sacado includes
>>     #include "Sacado.hpp"
>>
>>     // Shards includes
>>     #include "Shards_CellTopology.hpp"
>>
>>     // Teuchos includes
>>     #include "Teuchos_Array.hpp"
>>     #include "Teuchos_ArrayView.hpp"
>>     #include "Teuchos_ArrayViewDecl.hpp"
>>     #include "Teuchos_BLAS.hpp"
>>     #include "Teuchos_CommandLineProcessor.hpp"
>>     #include "Teuchos_DefaultMpiComm.hpp"
>>     #include "Teuchos_FancyOStream.hpp"
>>     #include "Teuchos_GlobalMPISession.hpp"
>>     #include "Teuchos_oblackholestream.hpp"
>>     #include "Teuchos_OrdinalTraits.hpp"
>>     #include "Teuchos_RCP.hpp"
>>     #include "Teuchos_ScalarTraits.hpp"
>>     #include "Teuchos_SerialDenseMatrix.hpp"
>>     #include "Teuchos_ArrayRCPDecl.hpp"
>>
>>     // Lone survivor from Tpetra
>>     #include "Tpetra_DefaultPlatform.hpp"
>>     #include "Tpetra_Version.hpp"
>>     #include "Tpetra_BlockMap_decl.hpp"
>>     #include "Tpetra_BlockMap.hpp"
>>     #include "Tpetra_Vector.hpp"
>>     #include "Tpetra_VbrMatrix.hpp"
>>     #include "Tpetra_DistObject_decl.hpp"
>>     #include "Tpetra_BlockMultiVector.hpp"
>>     #include "Tpetra_BlockMultiVector_decl.hpp"
>>     #include "Tpetra_MultiVector.hpp"
>>
>>     // MueLu includes
>>     #include "MueLu.hpp"
>>     #include "MueLu_FactoryManager.hpp"
>>     #include "MueLu_DirectSolver.hpp"
>>     #include "MueLu_Hierarchy.hpp"
>>     #include "MueLu_SaPFactory.hpp"
>>     #include "MueLu_GenericRFactory.hpp"
>>     #include "MueLu_RAPFactory.hpp"
>>     #include "MueLu_TentativePFactory.hpp"
>>     #include "MueLu_Ifpack2Smoother.hpp"
>>     #include "MueLu_SmootherFactory.hpp"
>>     #include "MueLu_RigidBodyModeFactory.hpp"
>>     #include <MueLu_UseDefaultTypes.hpp>
>>     #include <MueLu_MLParameterListInterpreter.hpp>
>>     #include "MueLu_Amesos2Smoother.hpp"
>>
>>     // Belos includes
>>     #include <BelosConfigDefs.hpp>
>>     #include <BelosLinearProblem.hpp>
>>     #include <BelosBlockCGSolMgr.hpp>
>>     #include <BelosXpetraAdapter.hpp>
>>     #include <BelosMueLuAdapter.hpp>
>>     #include <MueLu_UseShortNames.hpp>
>>
>>     using namespace Intrepid;
>>     typedef double ST;
>>     typedef Teuchos::ScalarTraits<ST> SCT;
>>     typedef SCT::magnitudeType MT;
>>     typedef long long int LLT;
>>     typedef Kokkos::DefaultNode::DefaultNodeType NO;
>>
>>     typedef Tpetra::CrsMatrix<ST, LO, LLT, NO, LMO> matrix_type;
>>     typedef Tpetra::Map<LO, LLT, NO> map_type;
>>     using Teuchos::RCP;
>>     using Teuchos::rcp;
>>
>>     int main(int argc, char *argv[])
>>     {
>>       using Teuchos::Array;
>>       Teuchos::oblackholestream nullStream;
>>       Teuchos::GlobalMPISession mpiSession(&argc, &argv, NULL);
>>       Epetra_MpiComm ecomm(MPI_COMM_WORLD);
>>
>>       Teuchos::RCP<const Teuchos::Comm<int> > comm =
>>         Tpetra::DefaultPlatform::getDefaultPlatform().getComm();
>>
>>       size_t myRank = comm->getRank();
>>       size_t nProc = comm->getSize();
>>       std::ostream &out = (myRank == 0) ? std::cout : nullStream;
>>       Teuchos::RCP<Teuchos::FancyOStream> fout =
>>         Teuchos::getFancyOStream(Teuchos::rcpFromRef(out));
>>
>>       Teuchos::CommandLineProcessor clp;
>>
>>       int GNEX = 64;
>>       clp.setOption("NX", &GNEX, "Number of elements in X-direction.");
>>
>>       int GNEY = 64;
>>       clp.setOption("NY", &GNEY, "Number of elements in Y-direction.");
>>
>>       int GNEZ = 64;
>>       clp.setOption("NZ", &GNEZ, "Number of elements in Z-direction.");
>>
>>       long long grow = 128;
>>       clp.setOption("grow", &grow, "Global row number");
>>
>>       std::string matrix("original");
>>       clp.setOption("matrix", &matrix, "original = A, exported = B");
>>
>>       clp.recogniseAllOptions(true);
>>       clp.throwExceptions(false);
>>
>>       Teuchos::CommandLineProcessor::EParseCommandLineReturn
>>         parseReturn = clp.parse(argc, argv);
>>       if (parseReturn == Teuchos::CommandLineProcessor::PARSE_HELP_PRINTED)
>>         return 0;
>>       if (parseReturn != Teuchos::CommandLineProcessor::PARSE_SUCCESSFUL)
>>         return 1;
>>
>>       const int numGElements = GNEX*GNEY*GNEZ;
>>       const int GNNX = GNEX+1, GNNY = GNEY+1, GNNZ = GNEZ+1;
>>       const int numGNodes = GNNX*GNNY*GNNZ;
>>       const int numGDof = GNNX*GNNY*GNNZ;
>>       out << "\t\t Number of global elements is : " << numGElements << "\n";
>>       out << "\t\t Number of global nodes is : " << numGNodes << "\n";
>>
>>       // Distribute element layers in Z; node planes are split so that
>>       // each rank owns a contiguous slab of GNNX*GNNY-node planes.
>>       Epetra_Map elemZMap(GNEZ, 0, ecomm);
>>       long count = GNNZ/nProc;
>>       long np; np = GNNZ - (nProc*count);
>>       count += (myRank < np) ? 1 : 0;
>>
>>       long long int nnu = count*GNNX*GNNY;
>>       LO indexBase = 0;
>>       Teuchos::RCP<const Tpetra::Map<LO, LLT> > gmap =
>>         Teuchos::rcp(new Tpetra::Map<LO, LLT>(-1, nnu, indexBase, comm));
>>       int numLNodes = gmap->getNodeNumElements();
>>
>>       int startOverlapZ = (myRank == 0) ?
>>         0 : (gmap->getGlobalElement(0)) - GNNX*GNNY;
>>       int endOverlapZ = (myRank == nProc-1) ?
>>         (gmap->getGlobalElement(0) + (count-1)*GNNX*GNNY) :
>>         (gmap->getGlobalElement(0) + count*GNNX*GNNY);
>>
>>       startOverlapZ /= (GNNX*GNNY); endOverlapZ /= (GNNX*GNNY);
>>
>>       int startX = 0; int endX = GNNX;
>>
>>       // Unique map start/end indices
>>       int startUY = (myRank == 0) ? startOverlapZ : (startOverlapZ+1);
>>       int endUY = (myRank == nProc-1) ? endOverlapZ : (endOverlapZ-1);
>>
>>       int numLElemZ = elemZMap.NumMyElements();
>>       int numLElemX = GNEX, numLElemY = GNEY;
>>       int numLElems = numLElemX*numLElemY*numLElemZ;
>>       int startElemZ = elemZMap.GID(0);
>>       int endElemZ = elemZMap.GID(numLElemZ-1);
>>       int startNodeZ = startElemZ;
>>       int endNodeZ = endElemZ+1;
>>
>>       typedef shards::CellTopology CellTopology;
>>       CellTopology h8(shards::getCellTopologyData<shards::Hexahedron<8> >());
>>
>>       // Get dimensions
>>       int numNodesPerElem = h8.getNodeCount();
>>       int spaceDim = h8.getDimension();
>>
>>       // Local element-to-node connectivity on the structured grid.
>>       FieldContainer<int> elemNode(numLElems, numNodesPerElem);
>>       int lElem = 0;
>>       for (int k = startElemZ; k <= endElemZ; k++)
>>         for (int j = 0; j < GNEY; j++)
>>           for (int i = 0; i < GNEX; i++)
>>           {
>>             elemNode(lElem,0) = (GNNX*GNNY)*k + (GNNX)*j + i;
>>             elemNode(lElem,1) = (GNNX*GNNY)*k + (GNNX)*j + i + 1;
>>             elemNode(lElem,2) = (GNNX*GNNY)*k + (GNNX)*(j + 1) + i + 1;
>>             elemNode(lElem,3) = (GNNX*GNNY)*k + (GNNX)*(j + 1) + i;
>>             elemNode(lElem,4) = (GNNX*GNNY)*(k+1) + (GNNX)*j + i;
>>             elemNode(lElem,5) = (GNNX*GNNY)*(k+1) + (GNNX)*j + i + 1;
>>             elemNode(lElem,6) = (GNNX*GNNY)*(k+1) + (GNNX)*(j + 1) + i + 1;
>>             elemNode(lElem,7) = (GNNX*GNNY)*(k+1) + (GNNX)*(j + 1) + i;
>>             lElem++;
>>           }
>>
>>       double leftX = 0.0, rightX = 1.0;
>>       double leftY = 0.0, rightY = 1.0;
>>       double leftZ = 0.0, rightZ = 1.0;
>>
>>       double hx = 1, hy = 1, hz = 1;  // unit mesh spacing
>>       numLNodes = (numLElemZ+1)*(numLElemY+1)*(numLElemX+1);
>>
>>       FieldContainer<double> nodeCoord(numLNodes, spaceDim);
>>       FieldContainer<int> nodeOnBoundary(numLNodes);
>>
>>       int inode = 0;
>>       for (int k = startNodeZ; k <= endNodeZ; k++)
>>         for (int j = 0; j < GNNY; j++)
>>           for (int i = 0; i < GNNX; i++)
>>           {
>>             nodeCoord(inode,0) = leftX + (double)i*hx;
>>             nodeCoord(inode,1) = leftY + (double)j*hy;
>>             nodeCoord(inode,2) = leftZ + (double)k*hz;
>>             inode++;
>>           }
>>
>>       DefaultCubatureFactory<double> cubFactory;
>>       int cubDegree = 3;
>>       Teuchos::RCP<Cubature<double> > hexCub = cubFactory.create(h8, cubDegree);
>>
>>       int cubDim = hexCub->getDimension();
>>       int numCubPoints = hexCub->getNumPoints();
>>
>>       FieldContainer<double> cubPoints(numCubPoints, cubDim);
>>       FieldContainer<double> cubWeights(numCubPoints);
>>
>>       hexCub->getCubature(cubPoints, cubWeights);
>>
>>       Basis_HGRAD_HEX_C1_FEM<double, FieldContainer<double> > hexHGradBasis;
>>       int numFieldsG = hexHGradBasis.getCardinality();
>>
>>       FieldContainer<double> hexGVals(numFieldsG, numCubPoints);
>>       FieldContainer<double> hexGrads(numFieldsG, numCubPoints, spaceDim);
>>
>>       hexHGradBasis.getValues(hexGVals, cubPoints, OPERATOR_VALUE);
>>       hexHGradBasis.getValues(hexGrads, cubPoints, OPERATOR_GRAD);
>>
>>       typedef CellTools<double> CellTools;
>>       typedef FunctionSpaceTools fst;
>>       int numCells = 1;
>>
>>       FieldContainer<double> hexNodes(numCells, numNodesPerElem, spaceDim);
>>       FieldContainer<double> hexJacobian(numCells, numCubPoints, spaceDim, spaceDim);
>>       FieldContainer<double> hexJacobInv(numCells, numCubPoints, spaceDim, spaceDim);
>>       FieldContainer<double> hexJacobDet(numCells, numCubPoints);
>>
>>       FieldContainer<double> localStiffMatrix(numCells, numFieldsG, numFieldsG);
>>       FieldContainer<double> weightedMeasure(numCells, numCubPoints);
>>       FieldContainer<double> hexGradsTransformed(numCells, numFieldsG,
>>                                                  numCubPoints, spaceDim);
>>       FieldContainer<double> hexGradsTransformedWeighted(numCells, numFieldsG,
>>                                                          numCubPoints, spaceDim);
>>
>>       FieldContainer<double> rhsData(numCells, numCubPoints);
>>       FieldContainer<double> localRHS(numCells, numFieldsG);
>>       FieldContainer<double> hexGValsTransformed(numCells, numFieldsG, numCubPoints);
>>       FieldContainer<double> hexGValsTransformedWeighted(numCells, numFieldsG,
>>                                                          numCubPoints);
>>       FieldContainer<double> physCubPoints(numCells, numCubPoints, cubDim);
>>
>>       Teuchos::Array<ST> values(8);
>>       Teuchos::Array<long long int> cols(8);
>>       Teuchos::RCP<matrix_type> A = rcp(new matrix_type(gmap, 27));
>>
>>       long long int rowIndex = 0, colIndex = 0, ierr, ix;
>>       for (int i = 0; i < lElem; i++)
>>       {
>>         // Recover the physical coordinates of this element's nodes.
>>         for (int j = 0; j < numNodesPerElem; j++)
>>         {
>>           ix = (elemNode(i,j)%GNNX);
>>           hexNodes(0,j,0) = ix*hx;
>>           hexNodes(0,j,1) = ((elemNode(i,j)%(GNNX*GNNY) - ix)/GNNX)*hy;
>>           hexNodes(0,j,2) = (elemNode(i,j)/(GNNX*GNNY))*hz;
>>         }
>>
>>         CellTools::setJacobian(hexJacobian, cubPoints, hexNodes, h8);
>>         CellTools::setJacobianInv(hexJacobInv, hexJacobian);
>>         CellTools::setJacobianDet(hexJacobDet, hexJacobian);
>>
>>         fst::HGRADtransformGRAD<double>(hexGradsTransformed, hexJacobInv,
>>                                         hexGrads);
>>         fst::computeCellMeasure<double>(weightedMeasure, hexJacobDet,
>>                                         cubWeights);
>>         fst::multiplyMeasure<double>(hexGradsTransformedWeighted,
>>                                      weightedMeasure, hexGradsTransformed);
>>         fst::integrate<double>(localStiffMatrix, hexGradsTransformed,
>>                                hexGradsTransformedWeighted, COMP_BLAS);
>>
>>         for (int row = 0; row < numFieldsG; row++)
>>         {
>>           rowIndex = elemNode(i,row);
>>           for (int col = 0; col < numFieldsG; col++)
>>           {
>>             cols[col] = elemNode(i,col);
>>             values[col] = localStiffMatrix(0, row, col);
>>           }
>>           // Zero out rows on the Z Dirichlet boundaries (z = 0 and
>>           // z = GNEZ, since hz = 1); the diagonal is set to 1 below.
>>           if ((hexNodes(0,row,2) == 0) || (hexNodes(0,row,2) == GNEZ))
>>             std::fill(values.begin(), values.end(), 0);
>>           A->insertGlobalValues(rowIndex, cols, values);
>>         }
>>       }
>>
>>       // First and last node planes (Dirichlet): put 1 on the diagonal.
>>       LLT drow = 0, maxind = 0, down = GNNX*GNNY, top = GNNX*GNNY*(GNNZ-1);
>>       LO num_local_nodes;
>>       maxind = gmap->getMaxGlobalIndex();
>>
>>       num_local_nodes = gmap->getNodeNumElements();
>>       if ((drow < down) || (maxind >= top))
>>       {
>>         for (int i = 0; i < num_local_nodes; i++)
>>         {
>>           drow = gmap->getGlobalElement(i);
>>           if ((drow < down) || (drow >= top))
>>             A->replaceGlobalValues(drow, Teuchos::tuple<long long int>(drow),
>>                                    Teuchos::tuple<ST>(1.0));
>>         }
>>       }
>>
>>       A->globalAssemble();
>>       A->fillComplete();
>>
>>       RCP<Tpetra::Vector<SC, LO, LLT, NO> > x1 =
>>         rcp(new Tpetra::Vector<SC, LO, LLT, NO>(A->getDomainMap()));
>>       RCP<Tpetra::Vector<SC, LO, LLT, NO> > b1 =
>>         rcp(new Tpetra::Vector<SC, LO, LLT, NO>(A->getRangeMap()));
>>       x1->putScalar(1.0);
>>       A->apply(*x1, *b1);
>>       out << "Norm of b1 is " << b1->norm2() << std::endl;
>>
>>       // Export A (unequal distribution) to B (near-uniform distribution).
>>       Teuchos::RCP<map_type> gmap2 = rcp(new map_type(numGNodes, indexBase,
>>                                          comm, Tpetra::GloballyDistributed));
>>       Teuchos::RCP<matrix_type> B = Teuchos::rcp(new matrix_type(gmap2, 0));
>>       typedef Tpetra::Export<LO, LLT, NO> export_type;
>>       export_type exporter(gmap, gmap2);
>>       B->doExport(*A, exporter, Tpetra::INSERT);
>>       B->fillComplete();
>>
>>       RCP<Tpetra::Vector<SC, LO, LLT, NO> > x2 =
>>         rcp(new Tpetra::Vector<SC, LO, LLT, NO>(B->getDomainMap()));
>>       RCP<Tpetra::Vector<SC, LO, LLT, NO> > b2 =
>>         rcp(new Tpetra::Vector<SC, LO, LLT, NO>(B->getRangeMap()));
>>       x2->putScalar(1.0);
>>       B->apply(*x2, *b2);
>>       out << "Norm of b2 is " << b2->norm2() << std::endl;
>>
>>       Teuchos::Array<ST> crsvals;
>>       size_t NumEntries = 0;
>>       Teuchos::Array<LLT> indices;
>>       NumEntries = B->getNumEntriesInGlobalRow(grow);
>>       if (NumEntries != Teuchos::OrdinalTraits<size_t>::invalid())
>>       {
>>         std::cout << "outputting crs values " << std::endl;
>>         indices.resize(NumEntries);
>>         crsvals.resize(NumEntries);
>>         A->getGlobalRowCopy(grow, indices(), crsvals(), NumEntries);
>>         for (int i = 0; i < indices.size(); i++)
>>           std::cout << indices[i] << " " << crsvals[i] << "\n";
>>       }
>>
>>       // Wrap the chosen Tpetra matrix for MueLu and build the hierarchy.
>>       Xpetra::UnderlyingLib lib = Xpetra::UseTpetra;
>>       Teuchos::RCP<Xpetra::CrsMatrix<ST, LO, LLT, NO, LMO> > mueluM_;
>>       if (matrix == "original")
>>         mueluM_ = rcp(new Xpetra::TpetraCrsMatrix<ST, LO, LLT, NO, LMO>(A));
>>       else
>>         mueluM_ = rcp(new Xpetra::TpetraCrsMatrix<ST, LO, LLT, NO, LMO>(B));
>>
>>       Teuchos::RCP<Xpetra::Matrix<ST, LO, LLT, NO, LMO> > mueluA =
>>         rcp(new Xpetra::CrsMatrixWrap<ST, LO, LLT, NO, LMO>(mueluM_));
>>
>>       Teuchos::RCP<MueLu::Hierarchy<ST, LO, LLT, NO, LMO> > H =
>>         rcp(new MueLu::Hierarchy<ST, LO, LLT, NO, LMO>(mueluA));
>>
>>       out << "Begin setup of Hierarchy ..." << std::endl;
>>
>>       H->Setup();
>>
>>       out << "Finished setup of Hierarchy ..." << std::endl;
>>     }
>>
>>
>>         _______________________________________________
>>         Trilinos-Users mailing list
>>         Trilinos-Users at software.sandia.gov
>>         https://software.sandia.gov/mailman/listinfo/trilinos-users
>>
>>
>>         --
>>         -Andrey
>>
>>
>>
>>


More information about the Trilinos-Users mailing list