re2c is a fast lexical analyzer generator that converts regular expression specifications into efficient C or C++ code. It is frequently used when building compilers, interpreters, and other applications that require custom tokenizers. This tutorial demonstrates how to install re2c on Ubuntu 26.04.
Prepare environment
Before installing re2c, make sure that the GCC compiler is installed, as it is required to compile the generated source code.
sudo apt install -y gcc
Install re2c
Refresh the local package index to obtain the latest package information.
sudo apt update
Install the re2c package.
sudo apt install -y re2c
After the installation completes, confirm that re2c is installed correctly by displaying its version.
re2c --version
Testing re2c
Create a new source file named main.re.
nano main.re
Add the following code:
#include <stdio.h>

typedef enum { END, WORD, NUMBER, UNKNOWN } Token;

Token lex(const char **s) {
    const char *p = *s;
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:define:YYCURSOR = p;
        re2c:yyfill:enable = 0;

        [ \t\n]+  { *s = p; return lex(s); }
        [a-zA-Z]+ { *s = p; return WORD; }
        [0-9]+    { *s = p; return NUMBER; }
        "\x00"    { return END; }
        .         { *s = p; return UNKNOWN; }
    */
}

int main(void) {
    const char *input = "Hello 42 world";
    Token tok;
    while ((tok = lex(&input)) != END) {
        printf("Token: %d\n", tok);
    }
    return 0;
}
This example defines a simple lexer that recognizes alphabetic words, numeric values, and unknown characters. Whitespace is skipped, and the terminating null character of the input string ends the scan.
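To make the rules easier to follow, here is a hand-written sketch in plain C of roughly the same behavior. It is not the code that re2c generates, and the names lex_by_hand, is_letter, is_digit, and is_space are hypothetical helpers introduced only for this illustration.

```c
typedef enum { END, WORD, NUMBER, UNKNOWN } Token;

/* ASCII-only character classes, mirroring [a-zA-Z], [0-9] and [ \t\n]. */
static int is_letter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
static int is_digit(char c)  { return c >= '0' && c <= '9'; }
static int is_space(char c)  { return c == ' ' || c == '\t' || c == '\n'; }

/* Hypothetical hand-written counterpart of lex(); an illustration only. */
Token lex_by_hand(const char **s) {
    const char *p = *s;

    /* Whitespace rule: skip the run and continue scanning. */
    while (is_space(*p)) p++;
    *s = p;

    /* Null character rule: stop without moving past the terminator. */
    if (*p == '\0') return END;

    /* Word rule: consume the longest run of letters. */
    if (is_letter(*p)) {
        while (is_letter(*p)) p++;
        *s = p;
        return WORD;
    }

    /* Number rule: consume the longest run of digits. */
    if (is_digit(*p)) {
        while (is_digit(*p)) p++;
        *s = p;
        return NUMBER;
    }

    /* Fallback rule: any other single character. */
    *s = p + 1;
    return UNKNOWN;
}
```

Calling lex_by_hand repeatedly on "Hello 42 world" yields WORD, NUMBER, WORD, and then END. The re2c version reaches the same result with a generated state machine instead of explicit loops.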
Generate a C source file from the re2c specification:
re2c main.re -o main.c
Compile the generated source code:
gcc main.c -o test
Run the executable:
./test
Expected output (the numbers are values of the Token enum, where 1 is WORD and 2 is NUMBER):
Token: 1
Token: 2
Token: 1
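The numeric output can be hard to read at a glance. As an optional sketch, a small helper can translate each Token value into its name. The function token_name is a hypothetical addition and is not part of re2c or of the example above.

```c
typedef enum { END, WORD, NUMBER, UNKNOWN } Token;

/* Hypothetical helper: maps a Token value to a readable name. */
const char *token_name(Token tok) {
    switch (tok) {
    case END:     return "END";
    case WORD:    return "WORD";
    case NUMBER:  return "NUMBER";
    case UNKNOWN: return "UNKNOWN";
    }
    return "?"; /* value outside the enum */
}
```

If this helper is placed in main.re after the Token definition, the printf call in main can be changed to printf("Token: %s\n", token_name(tok)); so that the program prints the token names instead of their numeric values.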
Uninstall re2c
If re2c is no longer required, remove the package by running the following command:
sudo apt purge --autoremove -y re2c