Version 1.6 of Aflex, the lexical analyzer generator, and version 1.4 of Ayacc, the Ada parser generator, provide numerous improvements to customize the generated scanner and parser. The major change is support for writing a reentrant scanner and parser. Let's have a look at it.
Reentrant scanner and parser with Aflex and Ayacc
By Stephane Carrez, 2023-05-14 19:02:00
What's new in Aflex 1.6
- Support the flex options %option output, %option nooutput, %option yywrap, %option noinput, %option noyywrap, %option unput, %option nounput, %option bufsize=NNN to better control the generated _IO package
- Aflex templates provide more control for tuning the code generation and they are embedded with Advanced Resource Embedder
- Support for defining Ada code blocks in the scanner file that are inserted in the generated scanner
- New option -P to generate a private Ada package for DFA and IO
- New directives %option reentrant and %yyvar to generate a recursive scanner
- New directive %yydecl to allow passing parameters to YYLex or changing the default function name
Example of %option directives that tell Aflex not to generate several functions or procedures and that customize the buffer size:
%option nounput
%option noinput
%option nooutput
%option noyywrap
%option bufsize=1024
The tool supports code block injection at various places in the generated scanner. A code block has the following syntax, where <block-name> is the name of the code block:
%<block-name> {
-- Put Ada code here
}
The %yytype code block can contain type, function and procedure declarations. It is injected in the declaration part of the YYLex function. The %yyinit code block can contain statements that are executed at the beginning of the YYLex function. The %yyaction code block can contain statements that are executed before running any action. The %yywrap code block can contain statements that are executed when the end of the current file is reached, to start parsing the next input.
What's new in Ayacc 1.4
- Support the Bison %define variable value option to configure the parser generator
- Support the Bison %code name { ... } directive to insert code verbatim into the output parser
- Recognize some Bison variables: api.pure, api.private, parse.error, parse.stacksize, parse.name, parse.params, parse.yyclearin, parse.yyerrok
- New option -S skeleton to allow using an external skeleton file for the parser generator
- Ayacc templates provide more control for tuning the code generation and they are embedded with Advanced Resource Embedder
- New option -P to generate a private Ada package for the tokens package
- Improvement to allow passing parameters to YYParse for the grammar rules
- New %lex directive to control the call of the YYLex function
- Fix #6: ayacc gets stuck creating an infinitely large file after encountering a comment in an action
The generator supports two code block injections: the first one, decl, is injected in the declaration part of the YYParse procedure, and the second one, init, is injected as the first statements, which are executed only once when the procedure is called. The syntax is borrowed from the Bison parser:
%code decl {
-- Put Ada declarations
}
%code init {
-- Put Ada statements
}
Some other Bison-like improvements have been introduced to control the generation of the parser code:
%define parse.error true
%define parse.stacksize 256
%define parse.yyclearin false
%define parse.yyerrok false
%define parse.name MyParser
How to use
The easiest way to use Ayacc and Aflex is to use Alire, get the sources, build them and install them. You can do this as follows:
alr get aflex
cd aflex_1.6.0_b3c21d99
alr build
alr install
alr get ayacc
cd ayacc_1.4.0_c06f997f
alr build
alr install
UPDATE: the alr install command is available only with Alire 2.0.
Using these tools is done in two steps:
- a first step to call the aflex or ayacc command with the scanner file or grammar file,
- a second step to call gnatchop to split the generated file into separate Ada files
For example, with a calc_lex.l scanner file, you would use:
aflex calc_lex.l
gnatchop -w calc_lex.ada
And with a calc.y grammar file:
ayacc calc.y
gnatchop -w calc.ada
To know more about how to write a scanner file or grammar file, have a look at Aflex 1.5 and Ayacc 1.3.0, which explains some of these aspects in more detail.
Highlight on reentrancy
By default, Aflex and Ayacc generate a scanner and a parser that use global variables declared in a generated Ada package. These global variables contain some state about the scanner, such as the current file being scanned. On its side, the Ayacc parser generates two global variables, YYLVal and YYVal.
Using global variables creates strong restrictions on the generated scanner and parser: we can scan and parse only one file at a time. They cannot be used in a multi-threaded environment unless scanning and parsing are protected from concurrent access. We also cannot easily use grammars that need to recurse and parse another file, such as an included file.
Reentrant scanner
The reentrant scanner is created by using the -R option or the %option reentrant directive. The scanner then needs a specific declaration with a context parameter that holds the scanner state and variables. The type of the context parameter is generated in the Lexer_IO package. The %yydecl directive in the scanner file must be used to declare the YYLex function with its parameters. By default the name of the context variable is Context, but you can change it to another name by using the %yyvar directive.
%option reentrant
%yyvar Context
%yydecl function YYLex (Context : in out Lexer_IO.Context_Type) return Token
When the reentrant option is activated, Aflex generates a first Context_Type limited type in the Lexer_DFA package and another one in the Lexer_IO package. The generator can probably be improved in the future to provide a single package with a single type declaration. The Lexer_DFA package contains the internal data structures that the scanner uses to maintain its state, and the Lexer_IO package holds the input file as well as the YYLVal and YYVal values.
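To make the idea concrete, here is a minimal, self-contained sketch of the context approach. It is not the code generated by Aflex: the Context_Type record, Open_Input and Next_Token below are toy stand-ins written for illustration only. The point is that all scanner state lives in a limited record passed as an in out parameter, so two scans can be interleaved without disturbing each other.

```ada
with Ada.Text_IO;

procedure Context_Demo is

   type Token is (Word, End_Of_Input);

   --  Toy stand-in (an assumption, not the Aflex generated type): the state
   --  that a non-reentrant scanner would keep in package-level variables.
   type Context_Type is limited record
      Input : String (1 .. 32) := (others => ' ');
      Last  : Natural  := 0;  --  index of the last valid character
      Pos   : Positive := 1;  --  index of the next character to read
   end record;

   --  Toy replacement for opening a file: Text must fit in 32 characters.
   procedure Open_Input (Context : in out Context_Type; Text : String) is
   begin
      Context.Input (1 .. Text'Length) := Text;
      Context.Last := Text'Length;
      Context.Pos  := 1;
   end Open_Input;

   --  Return Word for each blank-separated word, then End_Of_Input.
   --  A function with an "in out" parameter requires Ada 2012.
   function Next_Token (Context : in out Context_Type) return Token is
   begin
      --  Skip blanks.
      while Context.Pos <= Context.Last
        and then Context.Input (Context.Pos) = ' '
      loop
         Context.Pos := Context.Pos + 1;
      end loop;
      if Context.Pos > Context.Last then
         return End_Of_Input;
      end if;
      --  Consume one word.
      while Context.Pos <= Context.Last
        and then Context.Input (Context.Pos) /= ' '
      loop
         Context.Pos := Context.Pos + 1;
      end loop;
      return Word;
   end Next_Token;

   A, B : Context_Type;  --  two independent scanner states

begin
   Open_Input (A, "one two");
   Open_Input (B, "alpha");
   --  Interleaved calls: each context keeps its own position.
   Ada.Text_IO.Put_Line (Token'Image (Next_Token (A)));  --  WORD
   Ada.Text_IO.Put_Line (Token'Image (Next_Token (B)));  --  WORD
   Ada.Text_IO.Put_Line (Token'Image (Next_Token (B)));  --  END_OF_INPUT
   Ada.Text_IO.Put_Line (Token'Image (Next_Token (A)));  --  WORD
end Context_Demo;
```

With package-level variables instead of the record, the second Open_Input call would overwrite the state of the first scan.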
Reentrant parser
On its side, Ayacc uses the YYLVal and YYVal variables. By default, it generates them in the _tokens package that contains the list of parser symbols. For a reentrant parser, it must not generate them and must instead use the scanner Context_Type to hold them as well as the scanner internal state. The setup requires several steps:
- The reentrant parser is activated by using the %define api.pure directive, similar to the Bison %define.
- The %lex directive must be used to define how the YYLex function must be called, since it now has a context parameter.
- The scanner context variable must be declared somewhere, either as a parameter to the YYParse procedure or as a local variable of YYParse. This is done using the new %code decl directive, which allows customizing the local declaration part of the generated YYParse procedure.
- We must give visibility to the YYLVal and YYVal values defined in the scanner context variable. Again, we can do this within the %code decl directive.
A simple reentrant parser could be defined by using:
%define api.pure true
%lex YYLex (Scanner)
%code decl {
Scanner : Lexer_IO.Context_Type;
YYLVal : YYSType renames Scanner.YYLVal;
YYVal : YYSType renames Scanner.YYVal;
}
However, this simple form is not really useful as you may need to open the file and set up the scanner to read from it. It is probably better to pass the scanner context as a parameter to the YYParse procedure. For this, we can use the %define parse.params directive to control the procedure parameters. The reentrant parser is declared as follows:
%lex YYLex (Scanner)
%define api.pure true
%define parse.params "Scanner : in out Lexer_IO.Context_Type"
%code decl {
YYLVal : YYSType renames Scanner.YYLVal;
YYVal : YYSType renames Scanner.YYVal;
}
To use the reentrant parser and scanner, we only need to declare the scanner context, open the file by using the Lexer_IO.Open_Input procedure and call the YYParse procedure as follows:
Scanner : Lexer_IO.Context_Type;
...
Lexer_IO.Open_Input (Scanner, "file-to-scan");
YYParse (Scanner);
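Because the whole state travels with the context, handling an included file becomes a matter of declaring a fresh context and parsing recursively. The sketch below illustrates only that pattern with a toy, self-contained program: it does not use the Aflex or Ayacc generated packages, and Content, Next_Char and Parse are hypothetical names, with in-memory strings standing in for files and the '@' character followed by a digit meaning "include that file".

```ada
with Ada.Text_IO;

procedure Include_Demo is

   --  Hypothetical in-memory "files" standing in for real input files.
   type File_Id is range 1 .. 2;

   function Content (Id : File_Id) return String is
   begin
      case Id is
         when 1 => return "a@2b";  --  includes file 2 between 'a' and 'b'
         when 2 => return "xy";
      end case;
   end Content;

   --  All scanning state lives in the context, none in global variables.
   type Context_Type is limited record
      Id  : File_Id  := 1;
      Pos : Positive := 1;  --  1-based offset of the next character
   end record;

   --  Return the next character of the context's file, or set Done
   --  when the end of that file is reached.
   procedure Next_Char (Context : in out Context_Type;
                        Item    : out Character;
                        Done    : out Boolean) is
      Text : constant String := Content (Context.Id);
   begin
      Done := Context.Pos > Text'Length;
      if Done then
         Item := ' ';
      else
         Item := Text (Text'First + Context.Pos - 1);
         Context.Pos := Context.Pos + 1;
      end if;
   end Next_Char;

   procedure Parse (Id : File_Id) is
      Context : Context_Type;  --  a fresh context for each (nested) file
      Item    : Character;
      Done    : Boolean;
   begin
      Context.Id := Id;
      loop
         Next_Char (Context, Item, Done);
         exit when Done;
         if Item = '@' then
            Next_Char (Context, Item, Done);
            exit when Done;
            --  Recurse: the outer Context keeps its own position.
            Parse (File_Id'Value (String'(1 => Item)));
         else
            Ada.Text_IO.Put (Item);
         end if;
      end loop;
   end Parse;

begin
   Parse (1);
   Ada.Text_IO.New_Line;  --  prints "axyb"
end Include_Demo;
```

With a real grammar, the same idea would apply to the generated code: the action that recognizes an include would declare a second scanner context, open the included file on it and call YYParse with that context, leaving the outer context untouched.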
Grammar examples:
To have a more complete example of a reentrant parser, you may have a look at the following files: