List of keywords, by module

Rich_B · Post by **Rich_B** » 09 Dec 2021, 20:19

Recently there was a comment saying it would be nice to have a list of Elmer keywords, grouped by solver module.

In issue #201,
https://github.com/ElmerCSC/elmerfem/issues/201
there was this note:

kinnala commented on Oct 16, 2020
Seems that the easiest way to get a list of solver keywords is to apply ripgrep on elmerfem/fem/src/modules:
/src/elmerfem/elmer/fem/src/modules$ rg GetLogical
...
contrib/ShellMultiSolver/ShellMultiSolver.F90
257: LargeDeflection = GetLogical( SolverParams, 'Large Deflection', GotIt )
277: StressComputation = GetLogical( SolverParams, 'StressComputation', GotIt )
279: GetLogical( SolverParams, 'Calculate Stresses', GotIt )

It was easy to run, a copy of the text output is attached.

Maybe it was too easy, is there something missing or something that would make this list incomplete or misleading?

Edit: Turns out one needs to run ripgrep on these two directories, in order to get more of the keywords. The attached archive contains the results from both directories.

/src/elmerfem/elmer/fem/src
/src/elmerfem/elmer/fem/elmerice

Apparently this method returns a large number of keywords defined in code that uses the 'GetLogical' approach. Keywords that are defined using other methods (keywords such as 'Cartesian' or 'Axi Symmetric') will not be listed, so keep that in mind.
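For anyone who would rather script this than read raw ripgrep output, here is a minimal Python sketch of the same idea. The list of accessor names is an assumption on my part (only GetLogical, GetString and ListGetString are named in this thread), and the pattern only catches calls where the keyword is a string literal in the second argument on the same line.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed accessor names; Lists.F90 and DefUtils.F90 may define more.
ACCESSORS = ("GetLogical", "GetString", "GetInteger", "GetConstReal")

# Matches e.g.  GetLogical( SolverParams, 'Large Deflection', GotIt )
# and the List-prefixed form, e.g.  ListGetString( Params, 'Some Name', Found ).
CALL_RE = re.compile(
    r"\b(?:List)?(" + "|".join(ACCESSORS) + r")\s*\(\s*[^,()]+,\s*(['\"])(.+?)\2",
    re.IGNORECASE,
)


def keywords_in_text(text):
    """Return a sorted, de-duplicated list of (keyword, accessor) pairs."""
    return sorted({(m.group(3), m.group(1)) for m in CALL_RE.finditer(text)})


def keywords_by_file(root):
    """Scan all *.F90 files below root and group the matches by file."""
    found = defaultdict(list)
    for path in sorted(Path(root).rglob("*.F90")):
        hits = keywords_in_text(path.read_text(errors="replace"))
        if hits:
            found[str(path)] = hits
    return dict(found)
```

Calling keywords_by_file() on the two directories listed above should give roughly what ripgrep gives, grouped per file. It has the same blind spots: calls split over continuation lines, first arguments that contain parentheses, and keywords that are not read through these functions are all missed.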

Rich.

wiesi · Post by **wiesi** » 19 Dec 2021, 22:04

I think it would be nice to have some static code analysis tool that generates a list of all implemented keywords, so I dug a bit further.

There are many more functions for retrieving the different data types from the sif file. Most seem to be implemented in Lists.F90 and DefUtils.F90. Additional sif file processing using direct string processing can be found in ModelDescription.F90.

The following is out of curiosity: What's the difference between the functions without the "List" prefix (e.g. GetString) and those with the "List" prefix (e.g. ListGetString)? GetString just calls ListGetString, but there is some additional code in other functions without the List prefix. What's the purpose of the non-List-prefixed functions compared to the List-prefixed functions? And which variety should be used when?

raback · Post by **raback** » 20 Dec 2021, 03:03

Hi

The purpose of DefUtils.F90 is to make the writing of basic PDE modules easier for the layman. So it is a collection of the most important routines wrapped with some default values for parameters. For example, solving the linear system is hidden behind "DefaultSolve()".

The Lists.F90 includes much more than just the basic List operations. So sometimes the basic wrapper of DefUtils is not enough.

-Peter

Rich_B · Post by **Rich_B** » 21 Dec 2021, 18:49

Hello,

Regarding a static code analysis program, there might be two major use cases.

First case, run the program from the command line without parameters and it will create a list of all indexed keywords, showing the source module for each keyword. This list should be similar to the result from using ripgrep. It would be nice to present the keywords by module in alphabetical order, removing any duplication.

Second case, run the program from the command line with a solver name as a parameter (such as FluxSolver.F90), and it will create a list of keywords specific to the solver. This would be very useful when someone is working with a specific solver and needs to know all the relevant keywords from that solver.
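To make the two cases concrete, here is a small Python sketch of the post-processing only. It assumes some earlier step has already produced (module, keyword) pairs; the function names are made up for illustration.

```python
from collections import defaultdict


def group_by_module(hits):
    """hits: iterable of (module, keyword) pairs, possibly with repeats."""
    grouped = defaultdict(set)
    for module, keyword in hits:
        grouped[module].add(keyword)
    # Modules in alphabetical order, keywords in alphabetical order,
    # exact duplicates dropped by the set.
    return {
        module: sorted(grouped[module], key=str.lower)
        for module in sorted(grouped, key=str.lower)
    }


def keywords_for_solver(hits, solver):
    """Second use case: only the keywords found in one source file."""
    return group_by_module(hits).get(solver, [])
```

The first case is then group_by_module(hits) printed as-is, and the second is keywords_for_solver(hits, "FluxSolver.F90").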

Since the program would need to read through all the files in several directories, it would probably be sensible to write the program in C instead of Fortran.

Just my two cents!

Rich.

wiesi · Post by **wiesi** » 22 Dec 2021, 21:28

The second case is easy, once the first case works. I've started to implement some proof of concept using Python, but there are some difficulties:

1) Some keywords have "computed keyword names". For example, if the variable name in a solver section is selected via Variable = "xyz", the corresponding BCs may be xyz = value. It is difficult to get such keywords in a form usable for an end user, as it requires (really) parsing Fortran code.

2) Somewhat related to 1): There are some code locations that do not use "special functions" like GetString() to read the keywords, just normal Fortran string processing. A piece of code in ModelDescription.F90 is

...
        ELSE IF ( Name == 'echo on' ) THEN
          Echo = .TRUE.
        ELSE IF ( Name == 'echo off' ) THEN
           Echo = .FALSE.
        ELSE IF ( Name == 'numbering on' ) THEN
           Numbering = .TRUE.
...

which is impossible to find with simple grepping and regexes without prior knowledge.
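With that prior knowledge in hand (here: that the parser compares a variable called Name against string literals using ==), a targeted pattern can pick these cases up. A minimal Python sketch, where the variable name is a parameter that has to be supplied by whoever knows the code:

```python
import re


def literal_comparisons(source, variable="Name"):
    """Return the string literals that `variable` is compared against with ==."""
    pattern = re.compile(
        r"\b" + re.escape(variable) + r"\s*==\s*(['\"])(.*?)\1",
        re.IGNORECASE,
    )
    return [m.group(2) for m in pattern.finditer(source)]
```

Run on the excerpt above, this returns ['echo on', 'echo off', 'numbering on']. It does not handle the .EQ. operator or comparisons spread over several statements, so it is a helper for known locations, not a general solution.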

As noted by Peter in the github issue linked in the initial post, there is the compilation flag "DEVEL_KEYWORDMISSES". It writes missing keywords to a file. However, from looking at the source code and a quick test, it does this only for the solvers used at runtime and is therefore not suited for generating a list of all keywords missing in SOLVER.KEYWORDS.

It does not seem to be possible to automatically and reliably extract all keywords from the source code with reasonable effort.

An idea to get 100% coverage is to reject all keywords not listed in SOLVER.KEYWORDS with an error in the SIF parsing code. This would urge all developers to immediately insert the keywords when they are introduced or they wouldn't even be able to run tests. To make this work, the SOLVER.KEYWORDS syntax would have to be augmented to cover at least all the currently necessary "computed keyword" cases, so it is not that easy. Ideally, the documentation of the keywords is also inserted into the SOLVER.KEYWORDS file. Generating the keyword documentation sections in the LaTeX manual source, syntax highlight and autocomplete data for editors and IDEs would then be rather straightforward.

wiesi · Post by **wiesi** » 25 Dec 2021, 21:44

A first version of the tool is ready. It generates HTML output to make some markup possible. I've attached the output for the src/modules/MagnetoDynamics subdirectory if you are interested (had to put it in an archive, because the forum software refuses to accept html files).

What would be an appropriate license for releasing the tool? Is GPL OK or should it be LGPL? I have noticed that most Elmer source files are released under LGPL. While I know the basic differences between them, I don't know the background regarding the Elmer project.

Rich_B · Post by **Rich_B** » 26 Dec 2021, 23:26

Very nice work!

Here's an excerpt from your archive:

[Attachment: screenshot.png]

It's interesting that it identifies when the type of the found keyword doesn't match the Solver.Keywords list.

Rich.

wiesi · Post by **wiesi** » 29 Dec 2021, 20:54

Hi everyone,

you can find the tool on GitLab: https://gitlab.com/wiesi/elmerkeywordextractor
Be sure to read the "Limitations" section in README.md to know what it can do and what not.

raback · Post by **raback** » 03 Jan 2022, 15:56

Hi

Nice work! This thread has touched on the important topic of how to control the keywords. So some thoughts:

1) As observed, there are some keywords that are treated differently. Mainly these are controls for the model parser itself. These are old and very limited in number. So I don't think this is a major issue, maybe just a minor annoyance that, for example, "Check Keywords warn" does not obey the "Check Keywords = warn" style.

2) There is some inherent conflict between flexibility and enumerating all possible keywords. For example, users may basically freely choose the name of the field to be solved with the keyword "Variable = varname". The library then looks for many derived keywords. For example, the Dirichlet conditions are then given by "varname = Real", nodal loads will be "varname load", the soft upper limit "varname upper limit", etc. Basically we have chosen some conventions for the variable name to check the keywords, like "Temperature". Still, in a test case where we have two heat solvers there are temperature fields "TempA" and "TempB". So either we only list the keywords related to default variable names, or the checking should be done in two phases: first check the variable name, and on the second round the derived keywords.
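A rough Python sketch of such a two-phase check follows. The three templates are just the examples from this point (the real set of derived keywords is larger), the dictionary layout for the solver sections is made up, and case-insensitive matching is an assumption.

```python
# Templates taken from the examples above; the real list is longer.
DERIVED_TEMPLATES = ("{var}", "{var} load", "{var} upper limit")


def variable_names(solver_sections):
    """Phase 1: collect the user-chosen variable names from the solver sections."""
    return [s["variable"] for s in solver_sections if "variable" in s]


def derived_keywords(solver_sections):
    """Expand every template for every variable name found in phase 1."""
    return {
        template.format(var=name).lower()
        for name in variable_names(solver_sections)
        for template in DERIVED_TEMPLATES
    }


def unknown_keywords(used, static_keywords, solver_sections):
    """Phase 2: report keywords that are neither listed nor derived."""
    allowed = {k.lower() for k in static_keywords} | derived_keywords(solver_sections)
    return sorted(k for k in used if k.lower() not in allowed)
```

With two solver sections using TempA and TempB, "TempB load" passes the check while a mistyped "TempC" is reported.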

3) Currently the keywords are all in SOLVER.KEYWORDS. The user may make his own additions to this, but it is still a monolithic list. Maybe one could have FlowSolve.KEYWORDS, HeatSolve.KEYWORDS etc. Then the keywords would be checked in two phases: first read in the Solver modules, and only then check the keywords. This would support the same kind of modularity for keywords as there is in the actual code, and also in ElmerGUI with its XML files. Then HeatSolve.KEYWORDS and heatsolve.xml would maintain pretty similar information.
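As a sketch of how the modular variant could be checked, here is some Python that works on in-memory lists rather than real files. The per-solver lists, the keyword names in the usage note and the function names are all hypothetical; nothing here reflects how SOLVER.KEYWORDS is actually read.

```python
def load_keyword_sets(solver_names, per_solver_keywords, common_keywords=()):
    """Phase 1: union of the common list and the lists of the solvers in use."""
    active = set(common_keywords)
    missing_lists = []
    for name in solver_names:
        if name in per_solver_keywords:
            active |= set(per_solver_keywords[name])
        else:
            # A solver without its own keyword list is reported, not ignored.
            missing_lists.append(name)
    return active, missing_lists


def check(used_keywords, active):
    """Phase 2: report keywords not covered by any loaded list."""
    return sorted(set(used_keywords) - active)
```

If only HeatSolve is loaded, a keyword that appears only in the FlowSolve list is reported as unknown, which is the kind of modularity described above.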

4) The detection of missing keywords can be done in two ways. One is to run the consistency tests and other cases and let the code write out the missing keywords (with the keyword misses flag set on). The other is to analyze the code in a systematic manner, as is done in wiesi's approach. This second approach, which I applaud, would be an ideal tool if we also went for the modular keyword system.

Any thoughts?

-Peter

wiesi · Post by **wiesi** » 04 Jan 2022, 14:31

I am a very infrequent Elmer user, mostly for private projects and out of curiosity. So my opinions need to be taken with a grain of salt.
The following options and thoughts are of varying complexity and required effort. Personally, I favor the options that leave less room for error (missing keywords and their documentation) but they are the more complicated ones. Numbering related to Peter's post.

4.1) Extraction using the keyword miss option:

Pros: It is already there and can help identify missing keywords.
Cons: Requires re-compilation (I think this could be changed to a command line option rather easily). Requires running test cases that actually use the keywords. Otherwise the miss is not detected, AFAIK. Consequently, it requires up-to-date test cases for complete results. If a lot of test cases must be executed, the runtime can be quite long for just checking the keyword list.

4.2) Code analysis:

Pros: Some rudimentary parser is there and getting significant coverage seems not too difficult (but the last percents ...)
Cons: Requires quite some effort to make it work reliably and produce good, concise output. It can miss new or previously detectable keywords if the code changes in an unanticipated way. This may be easier if a more complete Fortran parser is used, maybe LLVM or other existing parsers. I lack experience in this field and can only guesstimate, but I expect some coding effort would still be required. Another drawback is that the parser must be maintained in addition to the actual Elmer code. It could also lead to some headaches in corner cases, for example if some heuristics relying on coding conventions must be used.

The effort to output ratio of this option seems to be good, but I don't consider it a clean solution for the reasons given.

Some other options I can think of:

4.3) Make the SOLVER.KEYWORDS entries mandatory.

Pros: May be easy to implement. Maybe just requires to make errors out of warnings.
Cons: Will provoke messages with existing cases until all keywords have been added. Works by enforcing self-discipline of the developers (can be good or bad). Consequently, it should not be easy to disable it. User-defined variable names may be more difficult to cover: May require extending the SOLVER.KEYWORDS syntax.

4.4) Require registering keywords:
Similar to 4.3 (thus many of the following points also apply to 4.3), but the other way round: Make the Elmer framework require the registration of keywords before they can be used. Keyword registration should be performed in a function/subroutine that can be called independently of regular solver functions. The SOLVER.KEYWORDS file could then be generated by calling all the registration functions of all linked and/or loaded solvers. Since no actual test cases must be executed, this would be really quick in comparison to the currently existing compile time option for missing keyword checking. Due to the mandatory registration prior to using the keyword, one can also be rather sure of completeness, because it would be the first thing a developer must implement before he/she can use the keyword. Generation of one keyword file per solver would be easy; just output to the corresponding file.

User defined variable names could be handled by registering keywords with yet unspecified name: For documentation purposes it is sufficient to know that, e.g. HeatSolver has a BC of type Real, nodal loads of type Real and so on. This may be enough information for a user to know how to use it, but maybe not if e.g. the GUI wants to use the keyword file.
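To illustrate the registration idea in 4.4, here is a toy sketch in Python (the real thing would of course live in the Fortran framework). Every name in it is hypothetical, including the "<variable>" placeholder for user-defined variable names and the line format written by dump(), which is only meant to be readable.

```python
class KeywordRegistry:
    """Keywords must be registered before they can be read."""

    def __init__(self):
        # (solver, section, lower-cased name) -> (type, description)
        self._entries = {}

    def register(self, solver, section, name, type_, description=""):
        self._entries[(solver, section, name.lower())] = (type_, description)

    def get(self, solver, section, name, values):
        """Read a keyword value; refuse keywords nobody registered."""
        if (solver, section, name.lower()) not in self._entries:
            raise KeyError(f"keyword {name!r} not registered by {solver}")
        return values.get(name)

    def dump(self, solver):
        """Produce the keyword list of one solver without running any case."""
        lines = []
        for (s, section, name), (type_, _) in sorted(self._entries.items()):
            if s == solver:
                lines.append(f"{section}:{type_}: '{name}'")
        return "\n".join(lines)


def register_heat_solver(registry):
    # Callable independently of the solver itself; "<variable>" stands for
    # a variable name that the user chooses later.
    registry.register("HeatSolve", "Boundary Condition", "<variable>", "Real", "Dirichlet value")
    registry.register("HeatSolve", "Boundary Condition", "<variable> load", "Real", "nodal load")
```

Generating a per-solver keyword file is then just a matter of calling the registration function and writing dump("HeatSolve") to disk. How the placeholder would be matched against real variable names is left open here.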

The SIF syntax could also be changed to make it less ambiguous (but that breaks backward compatibility ...). For example, BC settings could require the solver name as an additional word or option:
Instead of

Boundary Condition 1
  TempA = Real 0
End

it could be

Boundary Condition 1
  HeatSolve TempA = Real 0
End

This is somewhat similar to the conventions you mentioned, but enforced. I find it also easier to read if the SIF file was written by someone else who selected funky names.

Cons: Lots of coding effort.

4.5) Comment "markup"
Keyword information could also be provided in the sourcecode via special comments, similar to Doxygen. Actually, I considered this option before writing the parser but decided against it for multiple reasons: The sourcecode must be checked for keywords anyway. It is not clear where to place the comments. Completeness (and up-to-date-ness) is not enforced, so it is not better than manually editing the keywords file apart from bringing the documentation closer to the source.

4.6) Something else?

Documentation aspect:
From my perspective, the documentation aspect of the keywords file is more important than saving the typing of the datatype specification in the SIF. Consequently, I think it should contain more information: The keywords sections in the ElmerModelsManual.pdf should be generated from the keywords file, i.e. the explanation/description of the keyword should be in the keywords file. Making the description an option of the registration function in 4.4 above would be easy. Doing it the other way round for 4.3 is also easy. This would also facilitate generating context helps for Editors, IDEs, etc. (This is related to the discussion in the Github issue linked in the first post of this thread.)
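As a sketch of what "generated from the keywords file" could look like, assume each keyword is available as a record with a name, a type and a description (a made-up layout, not the current SOLVER.KEYWORDS format). The LaTeX description environment is only a stand-in; I have not checked which macros the manual sources actually use.

```python
def latex_keyword_list(records):
    """Render keyword records as a LaTeX description list, sorted by name."""
    lines = [r"\begin{description}"]
    for rec in sorted(records, key=lambda r: r["name"].lower()):
        # No escaping of LaTeX special characters is done in this sketch.
        lines.append(
            r"\item[%s (%s)] %s" % (rec["name"], rec["type"], rec["description"])
        )
    lines.append(r"\end{description}")
    return "\n".join(lines)


def completion_words(records):
    """The same records can feed an editor's autocomplete list."""
    return sorted({rec["name"] for rec in records})
```

Both outputs come from the same records, so a keyword added in one place shows up in the manual and in the editor data at the same time.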

I don't know the requirements and contents of the XML files for the ElmerGUI, because I rarely use the GUI. However, it sounds like a good idea to have one information database from which the others are derived. In other words: If possible, include additional information required for the XML file in the keywords file (or a per-solver metadata file) and auto-generate the XML. If the XML is already the more complete database, maybe ditch the keywords file and use the XML only. (Or use something else as the basis for both, but avoid the need to make modifications in more than one place.)

The keywords file/database doesn't seem to be the correct place for the theoretical information and derivations also given in the ElmerModelsManual (also because it is not as tightly related to the code as the keywords). However, if all meta-information regarding a solver is stored in a per-solver file or directory, this location might be.

Regarding 3): Yes, it would be nice to see which keywords belong to which Solver. A format according to the deliberations of the above paragraphs would provide this information anyway. Whether this information is contained in one file of appropriate structure or multiple files does not matter that much. Conversion back and forth via scripts would be possible once the information is available in a parsable format.

wiesi

Elmer Discussion Forum
