There are three main components in RXPY:
In addition, there are two supporting components:
For example, the expression (?P<number>[0-9]+)|\w* is compiled to the graph shown (the entry point is not indicated, but would be the ...|... node in this case). The interpretation of [0-9] and \w, however, depends on the alphabet used: it is not the same for ASCII and Unicode.
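The effect of the alphabet can be seen with the same expression in Python's standard re module. This is only an illustration and does not use RXPY's own API: re has no pluggable alphabet, so the re.ASCII flag stands in for an ASCII alphabet, and in re only \w (not [0-9]) changes meaning.

```python
import re

PATTERN = r'(?P<number>[0-9]+)|\w*'

# Default (Unicode) interpretation: \w covers accented letters.
unicode_match = re.match(PATTERN, 'été')

# ASCII interpretation: \w is restricted to [a-zA-Z0-9_], so the
# second alternative can only match the empty string here.
ascii_match = re.match(PATTERN, 'été', re.ASCII)

# The first alternative wins when the text starts with digits.
number_match = re.match(PATTERN, '42abc')

print(repr(unicode_match.group(0)))
print(repr(ascii_match.group(0)))
print(number_match.group('number'))
```

The same pattern text therefore gives different matches depending on how the character classes are interpreted, which is the decision RXPY delegates to the alphabet.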
The parser is a hand-written state machine. States are classes and the code is (in my opinion) quite simple and easy to extend.
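As a rough sketch of what "states are classes" can look like, the following toy parser handles only literals and [...] character classes. All names here (State, TopLevel, InClass, feed, finish, parse) are hypothetical and chosen for illustration; they are not RXPY's actual classes, and the output is a flat token list rather than a graph.

```python
class State:
    """Base class: a state consumes one character and returns the next state."""

    def __init__(self, tokens):
        self.tokens = tokens  # output list shared by all states

    def feed(self, char):
        raise NotImplementedError

    def finish(self):
        # Called at end of input; states that cannot end here override this.
        return self.tokens


class TopLevel(State):
    """Outside any character class."""

    def feed(self, char):
        if char == '[':
            return InClass(self.tokens, self)
        self.tokens.append(('literal', char))
        return self


class InClass(State):
    """Inside [...]; collects members until the closing bracket."""

    def __init__(self, tokens, parent):
        super().__init__(tokens)
        self.parent = parent
        self.members = []

    def feed(self, char):
        if char == ']':
            self.tokens.append(('class', ''.join(self.members)))
            return self.parent
        self.members.append(char)
        return self

    def finish(self):
        raise ValueError('unterminated character class')


def parse(pattern):
    """Drive the state machine one character at a time."""
    state = TopLevel([])
    for char in pattern:
        state = state.feed(char)
    return state.finish()


print(parse('a[bc]d'))
```

Extending a parser written in this style means adding a new state class and a transition into it, without touching the driving loop.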
The source and API docs for the parser are here.
Each graph node represents a single opcode for the engine (although engines are free to rewrite the graph and/or change the interpretation).
The source and API docs for the graph are here.
Graph nodes also support the visitor pattern. Calling any node’s visit method with a Visitor() instance triggers the appropriate callback, with the correct parameters.
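The dispatch described above might look roughly like this. The node classes (String, Match), the callback names (string, match) and their parameters are assumptions made for this sketch, not RXPY's real graph API.

```python
class Visitor:
    """Hypothetical callback interface: one method per node type."""

    def string(self, next_node, text):
        raise NotImplementedError

    def match(self):
        raise NotImplementedError


class String:
    """Node that matches a literal string, then continues to next_node."""

    def __init__(self, text, next_node):
        self.text = text
        self.next_node = next_node

    def visit(self, visitor):
        # The node selects the callback and supplies its own parameters.
        return visitor.string(self.next_node, self.text)


class Match:
    """Terminal node: the whole expression has matched."""

    def visit(self, visitor):
        return visitor.match()


class Describe(Visitor):
    """Example visitor that lists the opcodes along a path."""

    def string(self, next_node, text):
        return ['string %r' % text] + next_node.visit(self)

    def match(self):
        return ['match']


graph = String('ab', String('c', Match()))
print(graph.visit(Describe()))
```

Because each node knows which callback to invoke, a visitor never needs to inspect node types itself.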
An engine must use the graph (generated by the parser) to find a match in the input text.
RXPY currently has only one engine. The source and API docs for the simple engine are here.
The simple engine works as an interpreter, using the Visitor() interface (see comments on graphs above), but more complex approaches are also possible. For example, the graph could be used to generate opcodes in a more traditional form for a C-based engine, or the graph could undergo further analysis and rewriting.
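A minimal sketch of an interpreting engine built on a visitor follows. It is not the simple engine's actual code: the node types, callback names and the backtracking strategy are all assumptions, and it only reports where a match anchored at the start of the text ends.

```python
class String:
    def __init__(self, text, next_node):
        self.text = text
        self.next_node = next_node

    def visit(self, visitor):
        return visitor.string(self.next_node, self.text)


class Alternatives:
    def __init__(self, branches):
        self.branches = branches

    def visit(self, visitor):
        return visitor.alternatives(self.branches)


class Match:
    def visit(self, visitor):
        return visitor.match()


class ToyEngine:
    """Interprets the graph directly; each callback returns True on success."""

    def __init__(self, text):
        self.text = text
        self.offset = 0  # current position in the input

    def string(self, next_node, text):
        if not self.text.startswith(text, self.offset):
            return False
        self.offset += len(text)
        return next_node.visit(self)

    def alternatives(self, branches):
        saved = self.offset
        for branch in branches:
            self.offset = saved  # backtrack before trying each branch
            if branch.visit(self):
                return True
        return False

    def match(self):
        return True


def run(graph, text):
    """Return the end offset of a match at the start of text, or None."""
    engine = ToyEngine(text)
    return engine.offset if graph.visit(engine) else None


done = Match()
graph = Alternatives([String('cat', done), String('ca', String('r', done))])
print(run(graph, 'car'), run(graph, 'cat!'), run(graph, 'dog'))
```

An engine that compiled the graph to a flat opcode list instead would replace ToyEngine with a visitor that emits instructions rather than executing them.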
The alphabet is used in two separate ways.
The source and API docs for the alphabets are here.