pub struct CTLexerBuilder<'a, LexerTypesT: LexerTypes = DefaultLexerTypes<u32>>{ /* private fields */ }
A CTLexerBuilder allows one to specify the criteria for building a statically generated lexer.
Implementations§
impl CTLexerBuilder<'_, DefaultLexerTypes<u32>>
pub fn new() -> Self
Create a new CTLexerBuilder.
impl<'a, LexerTypesT: LexerTypes> CTLexerBuilder<'a, LexerTypesT>
pub fn new_with_lexemet() -> Self
Create a new CTLexerBuilder.
LexerTypesT::StorageT must be an unsigned integer type (e.g. u8, u16) which is big enough to index all the tokens, rules, and productions in the lexer and less than or equal in size to usize (e.g. on a 64-bit machine u128 would be too big). If you are lexing large files, the additional storage requirements of larger integer types can be noticeable, and in such cases it can be worth specifying a smaller type. StorageT defaults to u32 if unspecified.
§Examples
CTLexerBuilder::<DefaultLexerTypes<u8>>::new_with_lexemet()
.lexer_in_src_dir("grm.l")?
.build()?;
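The size constraint on StorageT described above can be illustrated without lrlex itself. The following is a minimal sketch using only the standard library; the helper name fits_storage and the counts passed to it are hypothetical and are not part of lrlex's API.

```rust
use std::mem::size_of;

/// Hypothetical helper (not part of lrlex): returns true if `T` is no wider
/// than `usize` and can represent every index in `0..count`.
fn fits_storage<T: TryFrom<usize>>(count: usize) -> bool {
    // The largest index that must be representable is `count - 1`.
    let max_index = count.saturating_sub(1);
    size_of::<T>() <= size_of::<usize>() && T::try_from(max_index).is_ok()
}
```

Under these assumptions, a lexer needing 256 distinct indices would fit in u8, while one needing 257 would require u16 or larger.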
pub fn lrpar_config<F>(self, config_func: F) -> Self
An optional convenience function to make it easier to create an (lrlex) lexer and (lrpar) parser in one shot. The closure passed to this function will be called during CTLexerBuilder::build: it will be passed an lrpar CTParserBuilder instance upon which it can set whatever lrpar options are desired. CTLexerBuilder will then create both the parser and lexer and link them together as required.
§Examples
CTLexerBuilder::new()
.lrpar_config(|ctp| {
ctp.yacckind(YaccKind::Grmtools)
.grammar_in_src_dir("calc.y")
.unwrap()
})
.lexer_in_src_dir("calc.l")?
.build()?;
pub fn lexer_in_src_dir<P>(self, srcp: P) -> Result<Self, Box<dyn Error>>
Set the input lexer path to a file relative to this project’s src directory. This will also set the output path (i.e. you do not need to call CTLexerBuilder::output_path).
For example if a/b.l is passed as srcp then CTLexerBuilder::build will:
- use src/a/b.l as the input file.
- write output to a file which can then be imported by calling lrlex_mod!("a/b.l").
- create a module in that output file named b_l.
You can override the output path and/or module name by calling CTLexerBuilder::output_path and/or CTLexerBuilder::mod_name, respectively, after calling this function.
This is a convenience function that makes it easier to compile lexer files stored in a project’s src/ directory: please see CTLexerBuilder::build for additional constraints and information about the generated files. Note also that each .l file can only be processed once using this function: if you want to generate multiple lexers from a single .l file, you will need to use CTLexerBuilder::output_path.
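As a rough illustration of how a path relative to src/ maps onto an input file, here is a minimal standard-library sketch. The helper name src_relative_input and the project_root parameter are hypothetical, and rejecting absolute paths is an assumption made for the sketch rather than documented lrlex behaviour.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of lrlex): resolves a lexer path given
/// relative to a project's `src` directory. Absolute paths are rejected,
/// which is an assumption of this sketch.
fn src_relative_input(project_root: &Path, srcp: &Path) -> Option<PathBuf> {
    if srcp.is_absolute() {
        return None;
    }
    // For `a/b.l` this yields `<project_root>/src/a/b.l`.
    Some(project_root.join("src").join(srcp))
}
```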
pub fn lexer_path<P>(self, inp: P) -> Self
Set the input lexer path to inp. If specified, you must also call CTLexerBuilder::output_path. In general it is easier to use CTLexerBuilder::lexer_in_src_dir.
pub fn output_path<P>(self, outp: P) -> Self
Set the output lexer path to outp. Note that there are no requirements on outp: the file can exist anywhere you can create a valid Path to. However, if you wish to use crate::lrlex_mod! you will need to make sure that outp is in std::env::var("OUT_DIR") or one of its subdirectories.
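The OUT_DIR requirement can be checked before calling output_path. The following is a minimal sketch, assuming a purely lexical comparison is sufficient (it does not resolve symlinks or `..` components); the helper name is_within_out_dir is hypothetical and not part of lrlex.

```rust
use std::path::Path;

/// Hypothetical helper (not part of lrlex): returns true if `outp` is
/// `out_dir` itself or lies somewhere beneath it.
fn is_within_out_dir(outp: &Path, out_dir: &Path) -> bool {
    // `Path::starts_with` compares whole path components, so
    // `/tmp/outer` is not treated as being inside `/tmp/out`.
    outp.starts_with(out_dir)
}
```

In a build script, out_dir would typically be built from the value of std::env::var("OUT_DIR").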
pub fn lexerkind(self, lexerkind: LexerKind) -> Self
Set the type of lexer to be generated to lexerkind.
pub fn mod_name(self, mod_name: &'a str) -> Self
Set the generated module name to mod_name. If no module name is specified, process_file will attempt to create a sensible default based on the input filename.
pub fn visibility(self, vis: Visibility) -> Self
Set the visibility of the generated module to vis. Defaults to Visibility::Private.
pub fn rust_edition(self, edition: RustEdition) -> Self
Sets the Rust edition to be used for generated code. Defaults to the latest edition of Rust supported by grmtools.
pub fn rule_ids_map<T: Borrow<HashMap<String, LexerTypesT::StorageT>> + Clone>( self, rule_ids_map: T, ) -> Self
Set this lexer builder’s map of rule IDs to rule_ids_map. By default, lexing rules have arbitrary, but distinct, IDs. Setting the map of rule IDs (from rule names to StorageT) allows users to synchronise a lexer and parser and to check that all rules are used by both parts.
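To show the shape of the map that rule_ids_map expects, here is a minimal standard-library sketch that assigns distinct IDs to rule names. The helper name assign_rule_ids and the choice of u32 as the storage type are assumptions for illustration; how the IDs are obtained from a parser is not shown.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of lrlex): assigns each rule name a
/// distinct ID in order of first appearance, using `u32` as the storage type.
fn assign_rule_ids(names: &[&str]) -> HashMap<String, u32> {
    let mut ids = HashMap::new();
    for name in names {
        // A repeated name keeps the ID it was given the first time.
        let next = ids.len() as u32;
        ids.entry(name.to_string()).or_insert(next);
    }
    ids
}
```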
pub fn build(self) -> Result<CTLexer, Box<dyn Error>>
Statically compile the .l file specified by CTLexerBuilder::lexer_path() into Rust, placing the output into the file specified by CTLexerBuilder::output_path().
The generated module follows the form:
mod modname {
pub fn lexerdef() -> LexerDef<LexerTypesT> { ... }
...
}
where:
- modname is either:
  - the module name specified by CTLexerBuilder::mod_name()
  - or, if no module name was explicitly specified, then for the file /a/b/c.l the module name is c_l (i.e. the file’s leaf name, minus its extension, with a suffix of _l).
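The default module naming rule described above can be sketched with the standard library alone. The helper name default_mod_name is hypothetical, and this is an illustration of the stated rule rather than lrlex's actual implementation.

```rust
use std::path::Path;

/// Hypothetical helper (not part of lrlex): derives a default module name
/// from a lexer file path by taking the leaf name without its extension
/// and appending `_l`.
fn default_mod_name(path: &Path) -> Option<String> {
    // `file_stem` returns the leaf name minus its final extension.
    let stem = path.file_stem()?.to_str()?;
    Some(format!("{}_l", stem))
}
```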
👎Deprecated since 0.11.0: Please use lexer_in_src_dir() and build() instead
pub fn process_file_in_src( self, srcp: &str, ) -> Result<(Option<HashSet<String>>, Option<HashSet<String>>), Box<dyn Error>>
Given the filename a/b.l as input, statically compile the file src/a/b.l into a Rust module which can then be imported using lrlex_mod!("a/b.l"). This is a convenience function around process_file which makes it easier to compile .l files stored in a project’s src/ directory: please see process_file for additional constraints and information about the generated files.
👎Deprecated since 0.11.0: Please use lexer_in_src_dir() and build() instead
pub fn process_file<P, Q>( self, inp: P, outp: Q, ) -> Result<(Option<HashSet<String>>, Option<HashSet<String>>), Box<dyn Error>>
Statically compile the .l file inp into Rust, placing the output into the file outp.
The latter defines a module as follows:
mod modname {
pub fn lexerdef() -> LexerDef<LexerTypesT::StorageT> { ... }
...
}
where:
- modname is either:
  - the module name specified by mod_name
  - or, if no module name was explicitly specified, then for the file /a/b/c.l the module name is c_l (i.e. the file’s leaf name, minus its extension, with a suffix of _l).
pub fn allow_missing_terms_in_lexer(self, allow: bool) -> Self
If passed false, tokens used in the grammar but not defined in the lexer will cause a panic at lexer generation time. Defaults to false.
pub fn allow_missing_tokens_in_parser(self, allow: bool) -> Self
If passed false, tokens defined in the lexer but not used in the grammar will cause a panic at lexer generation time. Defaults to true (since lexers sometimes define tokens such as reserved words, which are intentionally not in the grammar).
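The two checks above amount to set differences between the token names used by the grammar and those defined by the lexer. The following is a minimal standard-library sketch of that comparison; the helper name missing_tokens is hypothetical and this is not how lrlex necessarily implements the checks.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of lrlex): compares the token names used
/// by a grammar with those defined by a lexer, returning
/// (names missing from the lexer, names missing from the parser).
fn missing_tokens(
    grammar: &HashSet<String>,
    lexer: &HashSet<String>,
) -> (HashSet<String>, HashSet<String>) {
    // Used in the grammar but never defined by the lexer.
    let missing_from_lexer = grammar.difference(lexer).cloned().collect();
    // Defined by the lexer but never used in the grammar.
    let missing_from_parser = lexer.difference(grammar).cloned().collect();
    (missing_from_lexer, missing_from_parser)
}
```

With the defaults described above, a non-empty first set would be treated as an error, while a non-empty second set would be allowed.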
pub fn dot_matches_new_line(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. The default value is true.
pub fn multi_line(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. The default value is true.
pub fn octal(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. The default value is true.
pub fn swap_greed(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.
pub fn ignore_whitespace(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.
pub fn unicode(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.
pub fn case_insensitive(self, flag: bool) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.
pub fn size_limit(self, sz: usize) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.
pub fn dfa_size_limit(self, sz: usize) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.
pub fn nest_limit(self, lim: u32) -> Self
Sets the regex::RegexBuilder option of the same name. Default value is specified by regex.