Modifications to the ANTLR grammar

The scanner/lexer and parser for the SyReC grammar were generated with the parser generator ANTLR. The grammar is defined in two files: TSyrecLexer.g4 defines the tokens (terminal symbols) of the grammar, while TSyrecParser.g4 defines the productions (non-terminal symbols). Both files are located in the git repository under src/core/syrec/parser/antlr/grammar. No C++ code is embedded in the two .g4 files, so they can also be used to generate a basic SyReC parser for another programming language using ANTLR.

The header files generated by the parser generator are located in include/core/syrec/parser/antlr, while the source files are found in src/core/syrec/parser/antlr. The ANTLR runtime files required to build the generated header and source files of the SyReC parser are made available at configure time and are built with the C++ library.

Changes to the .g4 files of the scanner and parser are not automatically reflected in the C++ header/source files; the parser generator must be run again. It can be invoked in one of two ways:

  1. Integrate the parser generator into the CMake build as described in the official ANTLR repository, which would require a Java SE installation on the system performing the build.

  2. Run the parser generator manually on the developer's machine and then update the already existing generated files.

To keep the number of external dependencies of this project low, and because changes to the SyReC grammar are expected to be infrequent, the second option is the recommended one. Running the parser generator manually from the command line requires the following steps (the description assumes Windows; other operating systems require only minor changes):

  1. Download the ANTLR Java binary (the .jar file) and copy it to a location of your choice.

  2. Assuming that antlr.jar is located in the same folder as the .g4 grammar files and that the java executable is available on the command line, execute the following command:

    $ java -jar antlr.jar -Dlanguage=Cpp -package syrec_parser -o <OUTPUT_DIRECTORY_FOR_GEN_FILES> -no-visitor -no-listener -Werror TSyrecLexer.g4 TSyrecParser.g4
    

    Of all the files generated in the specified <OUTPUT_DIRECTORY_FOR_GEN_FILES> directory, only TSyrecLexer.h, TSyrecLexer.cpp, TSyrecParser.h, and TSyrecParser.cpp are relevant for the C++ library.

  3. Copy the relevant files generated in the previous step to the corresponding folders in the project:

    • Header files: include/core/syrec/parser/antlr

    • Source files: src/core/syrec/parser/antlr
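
Steps 2 and 3 above could also be scripted. The following Python sketch is only an illustration and is not part of the project: the function names build_antlr_command and copy_generated_files and the repo_root parameter are hypothetical, while the generator options, file names, and target folders are taken from the steps above. It assumes a Python 3 interpreter is available on the developer's machine.

```python
import shutil
from pathlib import Path

# The four generated files that the steps above list as relevant for the C++ library.
RELEVANT_HEADERS = ("TSyrecLexer.h", "TSyrecParser.h")
RELEVANT_SOURCES = ("TSyrecLexer.cpp", "TSyrecParser.cpp")


def build_antlr_command(output_dir, jar="antlr.jar"):
    """Return the generator invocation from step 2 as an argument list."""
    return [
        "java", "-jar", str(jar),
        "-Dlanguage=Cpp",
        "-package", "syrec_parser",
        "-o", str(output_dir),
        "-no-visitor", "-no-listener", "-Werror",
        "TSyrecLexer.g4", "TSyrecParser.g4",
    ]


def copy_generated_files(output_dir, repo_root):
    """Copy the relevant generated files into the project tree (step 3)."""
    output_dir, repo_root = Path(output_dir), Path(repo_root)
    targets = {
        repo_root / "include/core/syrec/parser/antlr": RELEVANT_HEADERS,
        repo_root / "src/core/syrec/parser/antlr": RELEVANT_SOURCES,
    }
    copied = []
    for target_dir, names in targets.items():
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in names:
            source = output_dir / name
            # Fail loudly if the generator did not produce an expected file.
            if not source.is_file():
                raise FileNotFoundError(f"expected generated file is missing: {source}")
            copied.append(Path(shutil.copy2(source, target_dir / name)))
    return copied
```

The argument list returned by build_antlr_command can be passed to subprocess.run with cwd set to the folder that contains the .g4 files and antlr.jar. Copying overwrites the existing files, so review the result with a diff tool before committing.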

Note

Running the .clang-tidy and .clang-format checks on the generated header and source files will report a large number of warnings that need to be fixed prior to any pull request. We therefore recommend using a diff tool to determine the changes between the current implementation and the newly generated code, and then merging only the relevant portions of the new code into the existing files. Changes to the .g4 grammar files might also require an update of the implementation of the parser components (in src/core/syrec/parser/components).
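
If no dedicated diff tool is at hand, the comparison between a checked-in file and its regenerated counterpart can be approximated with the difflib module of the Python standard library. This is a minimal sketch under that assumption; the helper name generated_code_diff is hypothetical and not part of the project.

```python
import difflib
from pathlib import Path


def generated_code_diff(current_file, generated_file):
    """Return a unified diff between the checked-in file and the regenerated one."""
    current = Path(current_file).read_text(encoding="utf-8").splitlines(keepends=True)
    generated = Path(generated_file).read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        current,
        generated,
        fromfile=f"current/{Path(current_file).name}",
        tofile=f"generated/{Path(generated_file).name}",
    ))
```

An empty result means the two files are identical; otherwise the lines prefixed with - and + show what the regenerated code would remove and add.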