| Home | Trees | Indices | Help |
|---|
|
|
A simple regular expression engine in pure Python.
This takes a set of regular expressions (only the most basic functionality is supported) and generates a finite state machines that can match them against a stream of values.
Although simple (and slow compared to a C version), it has some advantages from being implemented in Python.
- It can use a variety of alphabets - it is not restricted to strings. It could, for example, match lists of integers, or sequences of tokens.
- The NFA implementation can yield intermediate matches.
- It is extensible.
The classes here form a layered series of representations for regular expressions.
The first layer contains the classes Sequence, Option, Repeat, Choice, Labelled and Regexp. These are a high level representation that will typically be constructed by an alphabet-specific parser.
The second layer encodes the regular expression as a non-deterministic finite automaton (NFA) with empty transitions. This is a "natural" representation of the transitions that can be generated by the first layer.
The third layer encodes the regular expression as a deterministic finite automaton (DFA). This is a more "machine friendly" representation that allows for more efficient matching. It is generated by transforming a representation from the second layer.
|
|||
|
Alphabet Regular expressions are generalised over alphabets, which describe the set of acceptable characters. |
|||
|
Sequence A sequence of Characters, etc. |
|||
|
Option An optional sequence of Characters (or sequences). |
|||
|
Empty Matches an empty sequence. |
|||
|
Repeat A sequence of Characters (or sequences) that can repeat 0 or more times. |
|||
|
_Choice A set of alternative Characters (or sequences). |
|||
|
Labelled A labelled sequence. |
|||
|
RegexpError An error associated with (problems in) the regexp implementation. |
|||
|
Compiler Compile an expression. |
|||
|
BaseGraph Describes a collection of connected nodes. |
|||
|
NfaGraph Describes a NFA with epsilon (empty) transitions. |
|||
|
NfaPattern Given a graph this constructs a transition table and an associated matcher. |
|||
|
DfaGraph Describes a DFA where each node is a collection of NFA nodes. |
|||
|
NfaToDfa Convert a NFA graph to a DFA graph (uses the usual superset approach but does combination of states in a way that seems to fit better with the idea of character ranges). |
|||
|
DfaPattern Create a lookup table for a DFA and a matcher to evaluate it. |
|||
|
|||
|
|||
|
|||
_Alphabet = ABCMeta('_Alphabet', (object,), {})
|
|||
RegexpGraphNode = ABCMeta('RegexpGraphNode', (object,), {})
|
|||
| Home | Trees | Indices | Help |
|---|
| Generated by Epydoc 3.0.1 on Sat Jun 9 21:50:50 2012 | http://epydoc.sourceforge.net |