Lexing

Lexing is the act of taking in an input stream and splitting it into lexemes. Colloquially, lexing is often described as splitting input into words. In grmtools, a Lexeme has a type (e.g. "INT", "ID"), a value (e.g. "23", "xyz"), and knows which part of the user's input matched (e.g. "the input starting at index 7 to index 10"). There is also a simple mechanism to differentiate lexemes of zero length (e.g. DEDENT tokens in Python) from lexemes inserted by error recovery.

A subset of languages can use a simple lex/flex style approach to lexing, for which lrlex can be used. For situations which require more flexibility, users can write their own custom lexer provided it implements the lrpar::lex::NonStreamingLexer trait.