

An ad hoc parser for regular expressions. I think it's best to consider this as recursive descent with handwritten trampolining, but you can also consider the matchers as states in a state machine. Whatever it is, it works quite nicely, and exploits inheritance well. I call the matchers/states "builders".
Builders have references to their callers and construct the graph through those references (ultimately accumulating the graph nodes in the root SequenceBuilder).
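The idea of builders as trampolined states can be sketched roughly as follows. This is a toy illustration, not this module's code: `ToySequenceBuilder`, `ToyGroupBuilder`, `append_character` and `toy_parse` are hypothetical names, and the "graph" is reduced to nested tuples. Each builder consumes one character and returns whichever builder should handle the next one; a nested builder keeps a reference to its caller and hands its result back through that reference.

```python
class ToySequenceBuilder:
    """Hypothetical root state: collects nodes for a sequence."""

    def __init__(self):
        self.nodes = []

    def append_character(self, character):
        # Return the builder that should receive the next character.
        if character == '(':
            return ToyGroupBuilder(self)
        if character == ')':
            raise ValueError('unbalanced )')
        self.nodes.append(('char', character))
        return self


class ToyGroupBuilder(ToySequenceBuilder):
    """Hypothetical nested state: reuses sequence parsing via inheritance."""

    def __init__(self, caller):
        super().__init__()
        self.caller = caller  # reference back to the calling builder

    def append_character(self, character):
        if character == ')':
            # Hand the finished group to the caller and resume there.
            self.caller.nodes.append(('group', self.nodes))
            return self.caller
        return super().append_character(character)


def toy_parse(text):
    """The trampoline: a flat loop instead of recursive calls."""
    root = ToySequenceBuilder()
    builder = root
    for character in text:
        builder = builder.append_character(character)
    if builder is not root:
        raise ValueError('unclosed group')
    return root.nodes
```

Calling `toy_parse('a(bc)d')` yields a character node, a group node holding two character nodes, and a final character node, all accumulated in the root builder. The loop never recurses, so nesting depth does not consume Python stack frames.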


SequenceBuilder Parse a sequence (this is the main entry point for parsing, but users will normally call parse_pattern). 

RepeatBuilder Parse simple repetition expressions (*, + and ?). 

GroupEscapeBuilder Parse "group escapes": expressions of the form (?X...). 

ParserStateBuilder Parse embedded flags: expressions of the form (?i), (?m) etc. 

BaseGroupBuilder Support for parsing groups. 

GroupBuilder Parse groups: expressions of the form (...) containing subexpressions, like (ab[ce]*). 

LookbackBuilder Parse lookback expressions of the form (?<...). 

LookaheadBuilder Parse lookahead expressions of the form (?=...) and (?!...), along with lookback expressions (via LookbackBuilder). 

GroupConditionalBuilder Parse (?(id/name)yes-pattern|no-pattern) expressions. 

YesNoBuilder A helper for GroupConditionalBuilder that parses the yes and no subexpressions.


NamedGroupBuilder Parse '(?P<name>pattern)' and '(?P=name)' by creating either a matching group (and associating the name with the group number) or a group reference (for the group number). 

CommentGroupBuilder Parse comments: expressions of the form (?#...). 

CharacterBuilder Parse a character range: expressions of the form [...]. 

SimpleEscapeBuilder Parse the standard escaped characters, character codes (\x, \u and \U, by delegating to CharacterCodeBuilder), and octal codes (\000 etc., by delegating to OctalEscapeBuilder). 

IntermediateEscapeBuilder Extend SimpleEscapeBuilder to also handle group references (\1 etc.). 

ComplexEscapeBuilder Extend IntermediateEscapeBuilder to handle character classes (\b, \s etc.). 

CharacterCodeBuilder Parse character code escapes: expressions of the form \x..., \u..., and \U.... 

OctalEscapeBuilder Parse octal character code escapes: expressions of the form \000. 

GroupReferenceBuilder Parse group references: expressions of the form \1. 

CountBuilder Parse explicit counted repeats: expressions of the form ...{n,m}. 
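
For orientation, the syntax forms named in the summaries above can be tried out with Python's standard `re` module. This assumes the parser targets the same pattern syntax as `re`, which the listed forms suggest but this page does not state; the code below does not call this module at all. `EXAMPLES` and `run_examples` are hypothetical names, and the first column only notes which builder's summary describes the form.

```python
import re

# (builder whose summary covers the form, pattern, text, full match expected?)
EXAMPLES = [
    ('RepeatBuilder', r'ab*c?', 'abbb', True),
    ('CountBuilder', r'a{2,3}', 'aaa', True),
    ('CountBuilder', r'a{2,3}', 'a', False),
    ('GroupBuilder, GroupReferenceBuilder', r'(ab[ce]*)-\1', 'abcc-abcc', True),
    ('NamedGroupBuilder', r'(?P<word>\w+) (?P=word)', 'hey hey', True),
    ('ParserStateBuilder', r'(?i)abc', 'ABC', True),
    ('LookaheadBuilder', r'a(?=b)b', 'ab', True),
    ('LookaheadBuilder', r'a(?!b)\w', 'ab', False),
    ('LookbackBuilder', r'a(?<=a)b', 'ab', True),
    ('GroupConditionalBuilder', r'(<)?\w+(?(1)>|$)', '<tag>', True),
    ('GroupConditionalBuilder', r'(<)?\w+(?(1)>|$)', '<tag', False),
    ('CommentGroupBuilder', r'a(?#a comment)b', 'ab', True),
    ('CharacterBuilder', r'[a-c]+', 'abca', True),
    ('CharacterCodeBuilder, OctalEscapeBuilder', r'\x41\u0042\101', 'ABA', True),
]


def run_examples():
    """Return (builder, agrees) pairs, where agrees means re matched as expected."""
    results = []
    for builder, pattern, text, expected in EXAMPLES:
        matched = re.fullmatch(pattern, text) is not None
        results.append((builder, matched == expected))
    return results


if __name__ == '__main__':
    for builder, agrees in run_examples():
        print(builder, 'ok' if agrees else 'MISMATCH')
```

The conditional example reads: if group 1 (an opening `<`) matched, require a closing `>`, otherwise require the end of the text.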






Generated by Epydoc 3.0.1 on Tue Jun 29 03:38:21 2010  http://epydoc.sourceforge.net 