Parallel Computing made easier with internal partitioning

raback
Site Admin
Posts: 4802
Joined: 22 Aug 2009, 11:57
Antispam: Yes
Location: Espoo, Finland
Contact:

Parallel Computing made easier with internal partitioning

Post by raback »

Hi All,

Since quite recently, Elmer includes an interface to the Zoltan library, which provides partitioning and repartitioning routines (thanks Joe & Juhani). For standard use cases this offers a more straightforward way of running in parallel with MPI: the serial mesh is loaded by a master process and then distributed internally to the other parallel tasks.

To use internal partitioning with Zoltan you need two additional keywords in the Simulation section:

Code: Select all

  Partition Mesh = Logical True
  Partitioning Method = String "Zoltan"
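For orientation, a minimal Simulation section could look like the following sketch; apart from the two Zoltan lines, the other keywords (output level, simulation type, etc.) are only illustrative assumptions.

Code: Select all

Simulation
  Max Output Level = 5
  Coordinate System = Cartesian
  Simulation Type = Steady State
  Steady State Max Iterations = 1

  ! Internal partitioning with Zoltan
  Partition Mesh = Logical True
  Partitioning Method = String "Zoltan"
End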
Then you can run with

Code: Select all

  mpirun -np #np ElmerSolver_mpi 
and the number of partitions will automatically follow #np. This is the only place where you need to set the number of partitions.
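For example, a run on four MPI tasks, and thus four partitions, could look like this (the case file name case.sif is just a placeholder):

Code: Select all

mpirun -np 4 ElmerSolver_mpi case.sif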

In order to use Zoltan you need a version of Elmer that has been compiled with it. Zoltan is available as a git submodule and built as a CMake project. To include it in your own build, run the following in the source tree:

Code: Select all

git submodule sync
git submodule update --init
and edit your build script to include

Code: Select all

-DWITH_Zoltan:BOOL=TRUE 
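For reference, a full configure line could look roughly like the sketch below; the install prefix and source path are placeholders, and WITH_MPI is shown only because a parallel build is needed anyway.

Code: Select all

cmake -DWITH_MPI:BOOL=TRUE \
      -DWITH_Zoltan:BOOL=TRUE \
      -DCMAKE_INSTALL_PREFIX=/opt/elmer \
      /path/to/elmerfem
make -j4 && make install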
The current Launchpad version comes with Zoltan; the Windows version does not.

This is ideally suited to cases where the master process does not introduce a CPU time or memory bottleneck. Also, halo elements are not yet communicated, and no special constraints (related to BCs, for example) are considered. So the old predistributed approach is still often the best choice.

All comments and experiences are welcome!

-Peter
RuslanHVAC
Posts: 1
Joined: 24 Feb 2019, 23:37
Antispam: Yes

Re: Parallel Computing made easier with internal partitioning

Post by RuslanHVAC »

Hello Peter!
I use the Launchpad version (v8.4 from 2019-02-15) on CAE Linux 2018.
In fact, partitioning with Zoltan doesn't work.
My SIF file and the log from the terminal are attached.
Please help: what should I do to use Zoltan?

Thank you!
Attachments
Terminal LOG.txt
(3.66 KiB) Downloaded 306 times
caseEDT.sif
(5.96 KiB) Downloaded 313 times
raback
Site Admin
Posts: 4802
Joined: 22 Aug 2009, 11:57
Antispam: Yes
Location: Espoo, Finland
Contact:

Re: Parallel Computing made easier with internal partitioning

Post by raback »

Hi

Zoltan should be activated at compile time with "WITH_ZOLTAN". Obviously that version is too old to have it. And if you compile manually, remember to set the flag yourself.

-Peter
alexbrown
Posts: 24
Joined: 14 Jul 2020, 11:03
Antispam: Yes

Re: Parallel Computing made easier with internal partitioning

Post by alexbrown »

Hi Peter,

I managed to build Elmer under Windows with Zoltan, but when I tried to run a case in parallel, the error message below occurred:

Code: Select all

 Elements in partition:           1       13025
 Elements in partition:           2       13042
 Elements in partition:           3       12467
 Elements in partition:           4       11745
 Elements in partition:           5        9619
 Elements in partition:           6       11426
 Elements in partition:           7       12237
 Elements in partition:           8       13200
 k right out of bounds:           5       10412       45623           0       10403          15       87751
 k right out of bounds:           5       10416       35297           0       10403          15       87751
 k right out of bounds:           6       11232        3316           0       11164           8       87753
 k right out of bounds:           6       11233        4019           0       11164           8       87753
 k right out of bounds:           6       11251       18678           0       11164           8       87753
ERROR:: UnpackMeshPieces: Encountered 2 indexing issues in nodes
STOP 1
ERROR:: UnpackMeshPieces: Encountered 3 indexing issues in nodes
STOP 1
 k right out of bounds:           7       12045       26352           0       12034           3       87748
ERROR:: UnpackMeshPieces: Encountered 1 indexing issues in nodes
STOP 1
 k right out of bounds:           1       11967        4220           0       11883           6       87738
 k right out of bounds:           1       11968        3257           0       11883           6       87738
 k right out of bounds:           1       11971        3199           0       11883           6       87738
ERROR:: UnpackMeshPieces: Encountered 3 indexing issues in nodes
STOP 1

job aborted:
[ranks] message

[0] terminated

[1] process exited without calling finalize

[2-4] terminated

[5-7] process exited without calling finalize
raback wrote: 13 Jan 2020, 23:54 Hi

Zoltan should be activated at compile time with "WITH_ZOLTAN". Obviously that version is too old to have it. And if you compile manually, remember to set the flag yourself.

-Peter
raback
Site Admin
Posts: 4802
Joined: 22 Aug 2009, 11:57
Antispam: Yes
Location: Espoo, Finland
Contact:

Re: Parallel Computing made easier with internal partitioning

Post by raback »

Hi

We would need to see the full case to be able to debug it.

Can you do it the traditional way, i.e.

Code: Select all

ElmerGrid 2 2 mesh -metiskway 8 -partdual
mpirun -np 8 ElmerSolver_mpi 
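If it helps with checking the setup, after the ElmerGrid step the partition files should end up in a partitioning.8 subdirectory of the mesh directory (here called mesh), roughly like this:

Code: Select all

ls mesh/partitioning.8/
# part.1.header  part.1.nodes  part.1.elements  part.1.boundary  part.1.shared  ...
# one set of part.N.* files for each of the 8 partitions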
-Peter
alexbrown
Posts: 24
Joined: 14 Jul 2020, 11:03
Antispam: Yes

Re: Parallel Computing made easier with internal partitioning

Post by alexbrown »

Hi Peter,

I tried the traditional way; the mesh partitioning works, but when running the analysis using mpiexec -np ElmerSolver_mpi, some errors occur:
LoadMesh: Requested mesh > ./mesh/. < does not exist!

I suspect this may be due to my incorrect usage; do you have any example MPI run cases that I can refer to?

Further to this question, I am struggling a bit with choosing between the direct and iterative solvers for the analysis. From time to time, especially when the number of DOFs is large, I get errors such as "segment fault ..." or "Umf4num=-1.0000000" etc. When I use another computer to run the case, or run it at another time, it may suddenly work fine. :( Do you have any suggestion as to which solver option may be most robust for relatively large problems? Shall I use MUMPS or SuperLU?

Alex


raback wrote: 04 Aug 2020, 12:33 Hi

We would need to see the full case to be able to debug it.

Can you do it the traditional way, i.e.

Code: Select all

ElmerGrid 2 2 mesh -metiskway 8 -partdual
mpirun -np 8 ElmerSolver_mpi 
-Peter
raback
Site Admin
Posts: 4802
Joined: 22 Aug 2009, 11:57
Antispam: Yes
Location: Espoo, Finland
Contact:

Re: Parallel Computing made easier with internal partitioning

Post by raback »

Hi

The "mesh" was used for the name of the mesh directory. For you it could be something else...

Of the direct linear solvers: Umfpack is not parallel. If you have MUMPS, you can use that also in parallel.

The Krylov methods (iterative methods) should work in parallel. They may be sensitive to the preconditioning, but there is no inherent parallel limitation.
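For what it's worth, here is a sketch of the relevant linear system keywords in a Solver section; the exact method and preconditioner (BiCGStab, ILU0) and the tolerances are only illustrative choices.

Code: Select all

  ! Parallel direct solver (requires an Elmer build with MUMPS)
  Linear System Solver = Direct
  Linear System Direct Method = MUMPS
or, with an iterative Krylov method

Code: Select all

  ! Iterative Krylov solver with a simple preconditioner
  Linear System Solver = Iterative
  Linear System Iterative Method = BiCGStab
  Linear System Preconditioning = ILU0
  Linear System Max Iterations = 1000
  Linear System Convergence Tolerance = 1.0e-8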

-Peter