In some applications it is important not only to parse correctly structured input, but also to give a helpful response when the input is incorrectly structured.
LEPL provides support for reporting errors in the input in two ways. First, it allows a matcher to directly raise an exception. Second, parse tree nodes can be constructed which represent errors; these error nodes can then be used, later, to raise exceptions.
The advantage of the second approach is that it allows additional context to determine whether an error is present. An error node may be added to the results only to be discarded later during backtracking, because information from later in the input stream has shown that the error did not occur.
The implementation of both these approaches is simple, building directly on the functionality already available within LEPL (in particular, Nodes and function invocation). They should therefore be easy to extend to more complex schemes.
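As a rough illustration of the second approach, the following self-contained Python sketch builds an error node tentatively and drops it when later input shows that no error occurred. It does not use LEPL: `ErrorNode`, `unopen`, `plain_number` and `term` are hypothetical names invented here, and the alternatives are tried with simple first-match selection in an order chosen for this sketch, not with LEPL's full backtracking.

```python
import re


class ErrorNode(SyntaxError):
    """Hypothetical error node: a result value that is only raised if it survives."""


def unopen(text):
    """Tentatively match `number )`, producing an error node for the number.

    Returns (results, remaining_text) on success, or None on failure.
    """
    m = re.match(r'\d+', text)
    if m is None:
        return None
    remaining = text[m.end():]
    # The error node is created as soon as the number has been seen...
    results = [ErrorNode('no ( before {!r}'.format(remaining))]
    rest = remaining.lstrip()
    # ...but it only survives if a ')' really does follow.
    if not rest.startswith(')'):
        return None  # this branch fails, so its results (and the node) are dropped
    return results, rest[1:]


def plain_number(text):
    """Match a number with no error attached."""
    m = re.match(r'\d+', text)
    if m is None:
        return None
    return [int(m.group())], text[m.end():]


def term(text):
    """Try each alternative in turn and return the first that succeeds."""
    for alternative in (unopen, plain_number):
        outcome = alternative(text)
        if outcome is not None:
            return outcome
    return None
```

With the input `'5 + 1'` the error node created inside `unopen` is dropped and `term` returns the plain number. With `'5)'` the node survives in the results, where later code can choose to raise it.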
Here is an example of both approaches in use:
>>> from lepl import *
>>> class Term(Node): pass
>>> class Factor(Node): pass
>>> class Expression(Node): pass
>>> expr = Delayed()
>>> number = Digit()[1:,...] > 'number'
>>> badChar = AnyBut(Space() | Digit() | '(')[1:,...]
>>> with Separator(r'\s*'):
...     unopen = number ** make_error('no ( before {stream_out}') & ')'
...     unclosed = ('(' & expr & Eos()) ** make_error('no ) for {stream_in}')
...     term = Or(
...         (number | '(' & expr & ')') > Term,
...         badChar ^ 'unexpected text: {results[0]}',
...         unopen >> throw,
...         unclosed >> throw
...     )
...     muldiv = Any('*/') > 'operator'
...     factor = (term & (muldiv & term)[:]) > Factor
...     addsub = Any('+-') > 'operator'
...     expr += (factor & (addsub & factor)[:]) > Expression
...     line = Empty() & Trace(expr) & Eos()
>>> parser = line.string_parser()
>>> parser('1 + 2 * (3 + 4 - 5')[0]
File "str: '1 + 2 * (3 + 4 - 5'", line 1
1 + 2 * (3 + 4 - 5
^
lepl.error.Error: no ) for '(3 + 4...'
>>> parser('1 + 2 * 3 + 4 - 5)')[0]
File "str: '1 + 2 * 3 + 4 - 5)'", line 1
1 + 2 * 3 + 4 - 5)
^
lepl.error.Error: no ( before ')'
>>> parser('1 + 2 * (3 + four - 5)')[0]
File "str: '1 + 2 * (3 + four - 5)'", line 1
1 + 2 * (3 + four - 5)
^
lepl.error.Error: unexpected text: four
>>> parser('1 + 2 ** (3 + 4 - 5)')[0]
File "str: '1 + 2 ** (3 + 4 - 5)'", line 1
1 + 2 ** (3 + 4 - 5)
^
lepl.error.Error: unexpected text: *
Note
This example follows the "> Capitalised; >> lowercase" and "Use Or() With Complex Alternatives" patterns.
Warning
The order of expressions is important in the example above. The default Configuration will change the order of some expressions if the grammar is left-recursive. So if you have a left-recursive grammar and want to use this approach to error handling, you must use a custom configuration that excludes the optimize_or(conservative) rewriter. For more information see Memoisation.
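To see why the order matters, consider this minimal sketch in plain Python (not LEPL). The names `number`, `unopen` and `first_match` are hypothetical stand-ins, and the sketch assumes that alternatives are tried strictly in the order given. Under that assumption, the same input is accepted or rejected depending only on whether the error-raising alternative comes before or after the valid one.

```python
import re


def number(text):
    """Valid alternative: return (value, rest) for leading digits, else None."""
    m = re.match(r'\d+', text)
    return (int(m.group()), text[m.end():]) if m else None


def unopen(text):
    """Error alternative: raise as soon as `digits )` is seen."""
    if re.match(r'\d+\s*\)', text):
        raise SyntaxError('no ( before )')
    return None


def first_match(alternatives, text):
    """Try alternatives strictly in the order given; the first result wins."""
    for alternative in alternatives:
        outcome = alternative(text)
        if outcome is not None:
            return outcome
    return None


# Valid alternative first: the number is accepted and ')' is left for the caller.
accepted = first_match([number, unopen], '5)')

# Error alternative first: the same input now raises.
try:
    first_match([unopen, number], '5)')
    raised = False
except SyntaxError:
    raised = True
```

A rewriter that reorders alternatives could therefore turn a valid parse into a spurious error, which is the situation the warning describes.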
| Name | Type | Action |
|---|---|---|
| ^ | Operator | Raises an exception, given a format string. Formatting has the same named parameters as the KApply() matcher (results, stream_in, stream_out); implemented as KApply(raise_error). |
| raise_error | Function | See the ^ operator above. |
| Error | Class | Creates a parse tree node that can be used to trigger a later exception (Error is a subclass of both Node and SyntaxError). |
| throw | Function | Walks the parse tree (typically a sub-tree associated with a matcher’s result, with throw invoked by Apply()) and raises the first Error found. |
| make_error | Function | Creates an Error node, given a format string. |
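The relationship between the entries in the table can be sketched in plain Python. This is not LEPL's implementation: `Node`, `ErrorNode`, `error_factory` and `raise_first_error` are hypothetical stand-ins for the roles played by Node, Error, make_error and throw, and the only details taken from the table are the named formatting parameters and the fact that the error class is both a tree node and a SyntaxError.

```python
class Node:
    """Hypothetical minimal parse tree node: just a list of children."""

    def __init__(self, *children):
        self.children = list(children)


class ErrorNode(Node, SyntaxError):
    """Stand-in for an error node: both a tree node and an exception."""

    def __init__(self, message):
        Node.__init__(self)
        SyntaxError.__init__(self, message)


def error_factory(template):
    """Return a function that builds an ErrorNode from a format string."""
    def build(stream_in, stream_out, results):
        message = template.format(
            stream_in=stream_in, stream_out=stream_out, results=results)
        return ErrorNode(message)
    return build


def raise_first_error(node):
    """Walk the tree depth-first and raise the first ErrorNode found.

    Returns the tree unchanged when it contains no error (a choice made
    for this sketch).
    """
    if isinstance(node, ErrorNode):
        raise node
    for child in getattr(node, 'children', ()):
        raise_first_error(child)
    return node


# Build a small tree with an error node buried in a sub-tree, then walk it.
build = error_factory('unexpected text: {results[0]}')
tree = Node('1', '+', Node('2', '*', build('four - 5)', ' - 5)', ['four'])))
try:
    raise_first_error(tree)
except SyntaxError as error:
    message = str(error)
```

Because the error node is an ordinary result until the walk reaches it, it can sit in the tree for as long as the parser needs before deciding whether to raise.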