ElmerSolver_mpi stuck in a loop

Post by spacedout »

Good morning

I think the code is running in an infinite loop when I execute

mpirun -np 2 ElmerSolver_mpi case.sif

where case.sif contains

Solver 1
Equation = "potential"
Variable = -global Whatever

Exported Variable 1 = -global setflag

Exec Solver = Always
Procedure = "volt" "voltage"
End

Solver 2
Exec Condition = Equals setflag
Equation = "results"
Procedure = "ResultOutputSolve" "ResultOutputSolver"
Output File Name = "parav"
Vtu Format = Logical True
Single Precision = Logical True ! double precision is the default
Scalar Field 1 = String Potential
Vector Field 1 = String Velocity
End

and where volt.F90 contains

SUBROUTINE voltage( Model,Solver,dt,TransientSimulation )

..........
IF( ParEnv % MyPe /= 0 )RETURN

setfgVar => VariableGet( Solver % Mesh % Variables, 'setflag' )

...........

setfgVar % Values(1) = 1.0

...........
END SUBROUTINE voltage


This does not happen with
mpirun -np 1 ElmerSolver_mpi case.sif
or more simply
ElmerSolver case.sif

I am not sure how to go about debugging file ResultOutputSolve.F90 with gdb or any debugger for that matter.

All comments appreciated
Marc

Post by kevinarden »

Did you first partition the mesh into 2 partitions, and did it partition without errors? Did you then update the sif to point to the new partitioned mesh?

Post by raback »

Hi

When you return from some MPI process, do you consider that the others may want to sync and are still waiting?

-Peter

Post by spacedout »

I did

ElmerGrid 2 2 meshdirname -partdual -metiskway 2

The output shows no errors and claims it was successful.
I can see the 2 partitions under subfolder partitioning.2 of folder meshdirname.

If case.sif contains something like

Header
Mesh DB "." "meshdirname/partitioning.2"
End

the program aborts immediately with a warning about a non-existent partition 2.


Therefore I stick with using

Header
Mesh DB "." "meshdirname"
End


Now in more detail, my file volt.F90 also contains field variables:

SUBROUTINE voltage( Model,Solver,dt,TransientSimulation )

..........

IF( ParEnv % MyPe /= 0 )RETURN

setfgVar => VariableGet( Solver % Mesh % Variables, 'setflag' )

...........

setfgVar % Values(1) = 1.0

...........


DO i=1,Model % NumberOfNodes

j = fehdPerm(i)
IF( j == 0 ) CYCLE

DO k=1,DIM

aveFEHD(DIM*(j-1)+k) = 0.0

END DO

END DO

..........
END SUBROUTINE voltage

I presume

mpirun -np 2 ElmerSolver_mpi case.sif

will use subfolder partitioning.2 to find the mesh, and that only one processor (ParEnv % MyPe = 0) knows what to do with the above loop over the entire mesh. So all field variables are taken care of by one processor, and the other processor does not need to do anything at all.

You can of course correct me if I am wrong in my assumptions

Post by raback »

Hi

In MPI all processes typically carry out their own task on their own piece of data. Communication must be done explicitly using MPI commands. Partition 0 owns roughly half of the mesh and partition 1 the rest.
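
To see this split directly, a minimal sketch of a user solver in which every rank reports the size of its own partition. It assumes the usual Elmer solver interface and that `ParEnv % PEs` and `Solver % Mesh % NumberOfNodes` are visible through `USE DefUtils`; it is an illustration, not the attached volt.F90.

```fortran
SUBROUTINE voltage( Model,Solver,dt,TransientSimulation )
  USE DefUtils
  IMPLICIT NONE
  TYPE(Model_t) :: Model
  TYPE(Solver_t) :: Solver
  REAL(KIND=dp) :: dt
  LOGICAL :: TransientSimulation

  ! No early RETURN: every rank executes this and prints the node
  ! count of the mesh partition it holds locally.
  PRINT *, 'Partition', ParEnv % MyPe, 'of', ParEnv % PEs, &
       'holds', Solver % Mesh % NumberOfNodes, 'nodes'
END SUBROUTINE voltage
```

With `mpirun -np 2`, one line per rank should appear, each with a node count for its own piece of the mesh rather than for the whole mesh.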

Maybe you could add


Max Output Level = 20
Max Output Partition = 2
to get more data from both processes to see where the code freezes.

-Peter

Post by kevinarden »

It has been a while since I coded and compiled MPI programs, but I remember having to use an MPI compiler or set flags to get MPI to work for a program. I have not tested a user subroutine using elmerf90 with MPI. Perhaps elmerf90 handles it, or the technology has moved on from my previous experience.

I tried the same with one of my user subroutines compiled with elmerf90 and it worked fine with no issues, so the above does not appear to be the issue.
Last edited by kevinarden on 19 Mar 2021, 23:20, edited 1 time in total.

Post by kevinarden »

If you want to share the mesh, subroutine, and sif file, I can do an independent check.

Post by spacedout »

For case.sif, I added

Max Output Level = 20
Max Output Partition = 2

in the Simulation section

and with its

Solver 1
Equation = "potential"
Variable = -global Whatever

Exported Variable 1 = -global setflag

Exec Solver = Always
Procedure = "volt" "voltage"
End


Solver 2
Exec Condition = Equals setflag
Equation = "result vtu"
Procedure = "ResultOutputSolve" "ResultOutputSolver"
Output File Name = "parav"
Vtu Format = Logical True
Single Precision = Logical True ! double precision is the default
Scalar Field 1 = String Potential
Vector Field 1 = String Velocity

..........

End

and my volt.F90 is now simply
setfgVar => VariableGet( Solver % Mesh % Variables, 'setflag' )
setfgVar % Values(1) = 1.0

IF( ParEnv % MyPe /= 0 )RETURN

setfgVar % Values(1) = -1.0

RETURN

the results are

UpdateDependentObjects: Part1: Updating objects depending on primary field in steady state
DerivateExportedVariables: Part1: Derivating variables, if any!
UpdateDependentObjects: Part0: Updating objects depending on primary field in steady state
DerivateExportedVariables: Part0: Derivating variables, if any!

---- now program is frozen

whereas

setfgVar => VariableGet( Solver % Mesh % Variables, 'setflag' )
setfgVar % Values(1) = -1.0

IF( ParEnv % MyPe /= 0 )RETURN

setfgVar % Values(1) = 1.0

RETURN

the results are

UpdateDependentObjects: Part1: Updating objects depending on primary field in steady state
DerivateExportedVariables: Part1: Derivating variables, if any!
UpdateDependentObjects: Part0: Updating objects depending on primary field in steady state
DerivateExportedVariables: Part0: Derivating variables, if any!
SetActiveElementsTable: Part0: Creating active element table for: result vtu
SetActiveElementsTable: Part0: Number of active elements found : 8029

---- now program is frozen

However if volt.F90 is reduced to
setfgVar => VariableGet( Solver % Mesh % Variables, 'setflag' )
setfgVar % Values(1) = 1.0

RETURN
then the program does not freeze.

Quite fantastic! You would think setflag is a global variable, but it is as if there is a separate setflag variable for each processor.

Also I am not sure how easy it is to change volt.F90 to incorporate MPI communications of the sort I saw inside MainUtils.F90
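
One possible way to keep the flag identical on all ranks, sketched under assumptions: rank 0 decides the value and broadcasts it, and no rank returns before the collective call. The sketch assumes that `MPI_BCAST`, `MPI_DOUBLE_PRECISION` and the communicator `ELMER_COMM_WORLD` are visible through `USE DefUtils`; the local name `flag` and the placeholder decision logic are hypothetical, and this has not been tested against the case in this thread.

```fortran
SUBROUTINE voltage( Model,Solver,dt,TransientSimulation )
  USE DefUtils
  IMPLICIT NONE
  TYPE(Model_t) :: Model
  TYPE(Solver_t) :: Solver
  REAL(KIND=dp) :: dt
  LOGICAL :: TransientSimulation

  TYPE(Variable_t), POINTER :: setfgVar
  REAL(KIND=dp) :: flag   ! hypothetical local holding the decided value
  INTEGER :: ierr

  ! Every rank looks up its own copy of the exported variable.
  setfgVar => VariableGet( Solver % Mesh % Variables, 'setflag' )
  IF( .NOT. ASSOCIATED( setfgVar ) ) THEN
    CALL Fatal( 'voltage', 'Variable setflag not found' )
  END IF

  ! Only rank 0 decides the value (placeholder logic).
  flag = 0.0_dp
  IF( ParEnv % MyPe == 0 ) flag = 1.0_dp

  ! Collective call: every rank must reach it, so there is no
  ! early RETURN above this line.
  CALL MPI_BCAST( flag, 1, MPI_DOUBLE_PRECISION, 0, ELMER_COMM_WORLD, ierr )

  ! All ranks now store the same value, so "Exec Condition" in
  ! Solver 2 evaluates the same way on every partition.
  setfgVar % Values(1) = flag
END SUBROUTINE voltage
```

The design point is only that each rank holds its own copy of setflag, so the decision has to be communicated before any solver that synchronizes across ranks is conditionally executed.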

Have a nice weekend

Post by spacedout »

Good day kevinarden

I have attached a barebones mesh and program. Download all 3 files in the same folder and execute

ElmerGrid 1 2 rect.grd

ElmerGrid 2 2 rect -partdual -metiskway 2

elmerf90 volt.F90 -o volt.so

mpirun -np 2 ElmerSolver_mpi case.sif

within that folder.

The program will freeze almost right away. And of course, if you comment out the line

IF( ParEnv % MyPe /= 0 )RETURN

in volt.F90, the program runs normally.

These two lines in the simulation section of case.sif, as suggested by Peter, are useful in debugging.

Max Output Level = 20
Max Output Partition = 2
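
For reference, a sketch of where those two keywords sit. The other keywords of the Simulation section are left out here and are not reproduced from the attached case.sif.

```
Simulation
  ! ... existing Simulation keywords stay as they are ...
  Max Output Level = 20
  Max Output Partition = 2
End
```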

Have a nice end of weekend
Marc
Attachments
volt.F90 (668 Bytes)
rect.grd (373 Bytes)
case.sif (2.17 KiB)

Post by kevinarden »

It happens on my system exactly as you describe.