Package lepl :: Package regexp :: Module core
[hide private]
[frames] | no frames]

Module core

source code

A simple regular expression engine in pure Python.

This takes a set of regular expressions (only the most basic functionality is supported) and generates a finite state machines that can match them against a stream of values.

Although simple (and slow compared to a C version), it has some advantages from being implemented in Python.

The classes here form a layered series of representations for regular expressions.

The first layer contains the classes Sequence, Option, Repeat, Choice, Labelled and Regexp. These are a high level representation that will typically be constructed by an alphabet-specific parser.

The second layer encodes the regular expression as a non-deterministic finite automaton (NFA) with empty transitions. This is a "natural" representation of the transitions that can be generated by the first layer.

The third layer encodes the regular expression as a deterministic finite automaton (DFA). This is a more "machine friendly" representation that allows for more efficient matching. It is generated by transforming a representation from the second layer.

Classes [hide private]
  Alphabet
Regular expressions are generalised over alphabets, which describe the set of acceptable characters.
  Sequence
A sequence of Characters, etc.
  Option
An optional sequence of Characters (or sequences).
  Empty
Matches an empty sequence.
  Repeat
A sequence of Characters (or sequences) that can repeat 0 or more times.
  _Choice
A set of alternative Characters (or sequences).
  Labelled
A labelled sequence.
  RegexpError
An error associated with (problems in) the regexp implementation.
  Compiler
Compile an expression.
  BaseGraph
Describes a collection of connected nodes.
  NfaGraph
Describes a NFA with epsilon (empty) transitions.
  NfaPattern
Given a graph this constructs a transition table and an associated matcher.
  DfaGraph
Describes a DFA where each node is a collection of NFA nodes.
  NfaToDfa
Convert a NFA graph to a DFA graph (uses the usual superset approach but does combination of states in a way that seems to fit better with the idea of character ranges).
  DfaPattern
Create a lookup table for a DFA and a matcher to evaluate it.
Functions [hide private]
 
Choice(alphabet, *children)
Encase a choice in a sequence so that it can be treated in the same way as other components (if we don't do this, then passing it to another sequence will split it into constituents and then sequence them).
source code
Variables [hide private]
  _Alphabet = ABCMeta('_Alphabet', (object,), {})
  RegexpGraphNode = ABCMeta('RegexpGraphNode', (object,), {})