Home  Trees  Indices  Help 


A simple regular expression engine in pure Python.
This takes a set of regular expressions (only the most basic functionality is supported) and generates a finite state machines that can match them against a stream of values.
Although simple (and slow compared to a C version), it has some advantages from being implemented in Python.
 It can use a variety of alphabets  it is not restricted to strings. It could, for example, match lists of integers, or sequences of tokens.
 The NFA implementation can yield intermediate matches.
 It is extensible.
The classes here form a layered series of representations for regular expressions.
The first layer contains the classes Sequence, Option, Repeat, Choice, Labelled and Regexp. These are a high level representation that will typically be constructed by an alphabetspecific parser.
The second layer encodes the regular expression as a nondeterministic finite automaton (NFA) with empty transitions. This is a "natural" representation of the transitions that can be generated by the first layer.
The third layer encodes the regular expression as a deterministic finite automaton (DFA). This is a more "machine friendly" representation that allows for more efficient matching. It is generated by transforming a representation from the second layer.


Alphabet Regular expressions are generalised over alphabets, which describe the set of acceptable characters. 

Sequence A sequence of Characters, etc. 

Option An optional sequence of Characters (or sequences). 

Empty Matches an empty sequence. 

Repeat A sequence of Characters (or sequences) that can repeat 0 or more times. 

_Choice A set of alternative Characters (or sequences). 

Labelled A labelled sequence. 

RegexpError An error associated with (problems in) the regexp implementation. 

Compiler Compile an expression. 

BaseGraph Describes a collection of connected nodes. 

NfaGraph Describes a NFA with epsilon (empty) transitions. 

NfaPattern Given a graph this constructs a transition table and an associated matcher. 

DfaGraph Describes a DFA where each node is a collection of NFA nodes. 

NfaToDfa Convert a NFA graph to a DFA graph (uses the usual superset approach but does combination of states in a way that seems to fit better with the idea of character ranges). 

DfaPattern Create a lookup table for a DFA and a matcher to evaluate it. 





_Alphabet = ABCMeta('_Alphabet', (object,), {})


RegexpGraphNode = ABCMeta('RegexpGraphNode', (object,), {})

Home  Trees  Indices  Help 

Generated by Epydoc 3.0.1 on Sat Jun 9 21:50:50 2012  http://epydoc.sourceforge.net 