Frequently Asked Questions

Why do I get “Cannot parse regexp...”?

Why do I get “Cannot parse regexp '(' using ...” for Token('(')?

String arguments to Token() are treated as regular expressions. Because ( has a special meaning in a regular expression, you must escape it, either with a doubled backslash, Token('\\('), or with a raw string, Token(r'\(').
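The escaping rule can be checked with Python's standard re module. This is only an illustrative sketch: it uses re as a stand-in, on the assumption that an unescaped parenthesis is rejected for the same reason, and LEPL's own regular-expression engine may differ in other details.

```python
import re

# The two spellings are the same two-character string:
# a backslash followed by an open parenthesis.
escaped = '\\('
raw = r'\('
same_string = (escaped == raw)

# An unescaped '(' opens a group that is never closed, so it is rejected.
try:
    re.compile('(')
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# The escaped form matches a literal parenthesis.
matches_literal = re.fullmatch(raw, '(') is not None

print(same_string, unescaped_ok, matches_literal)
```

The raw-string form is usually easier to read because each backslash appears only once.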

Why isn’t my parser matching the full expression? (1)

In the code below:

word = Token('[a-z]+')
lpar = Token('\\(')
rpar = Token('\\)')
expression = word | (word & lpar & word & rpar)

why does expression.parse('hello(world)') match just 'hello'?

In general LEPL is greedy (it tries to match the longest possible string), but Or() tries its alternatives left to right. Here the first alternative, word, succeeds on 'hello', so that is the result you get. In this case you should put the longer alternative first:

expression = (word & lpar & word & rpar) | word

Alternatively, you can force the parser to match the entire input by ending with Eos():

expression = word | (word & lpar & word & rpar)
complete = expression & Eos()
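The effect of ordering can be reproduced with Python's standard re module, whose | operator also tries alternatives left to right. This is an analogy rather than LEPL code: the patterns below mirror word, lpar and rpar, and the trailing $ plays the role assumed here for Eos().

```python
import re

text = 'hello(world)'
word = r'[a-z]+'
call = r'[a-z]+\([a-z]+\)'   # word, '(', word, ')'

# Short alternative first: the first alternative that succeeds wins.
short_first = re.match(f'{word}|{call}', text).group()

# Long alternative first: the whole input is consumed.
long_first = re.match(f'{call}|{word}', text).group()

# Short alternative first, but the end of input is required, so the
# engine backtracks into the second alternative.
anchored = re.match(f'(?:{word}|{call})$', text).group()

print(short_first)
print(long_first)
print(anchored)
```

Requiring the end of input does not change the order in which alternatives are tried. It only rejects the short match, so the search continues with the next alternative.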

Why isn’t my parser matching the full expression? (2)

In the code below:

constant = Float() >> float
matcher = Or(
    constant >> constant_gen,
    # ... some other stuff ...
)

matcher.parse('2.5x')

[C(2.5)]

It seems Float matched the part it liked, and just ignored the rest.

The simple solution is to add Eos() to the end of the parser, so that the entire stream must match.

But it would be better to use tokens here. You could have one token for numeric values and another for alphanumeric “words”.
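Both ideas can be sketched with Python's standard re module. This is not LEPL's lexer: the group names NUMBER and WORD and the two patterns are hypothetical, chosen only to show how a prefix match differs from a full match and how splitting the input into tokens makes the trailing x visible.

```python
import re

text = '2.5x'
number = r'\d+\.\d+'

# A prefix match accepts the part it likes and ignores the rest.
prefix = re.match(number, text).group()

# Requiring the whole input to match rejects '2.5x'.
whole = re.fullmatch(number, text)

# Hypothetical token definitions: one for numeric values, one for words.
token_pattern = re.compile(r'(?P<NUMBER>\d+\.\d+)|(?P<WORD>[a-z]+)')
tokens = [(m.lastgroup, m.group()) for m in token_pattern.finditer(text)]

print(prefix)
print(whole)
print(tokens)
```

Once the input is a list of tokens, a parser that expects a single numeric token can see that a word token follows it.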

How do I parse an entire file?

I understand how to parse a string, but how do I parse an entire file?

Instead of .parse() or .parse_string() (or .match() or .match_string()) use .parse_file() or .parse_path() (or .match_file() or .match_path()).

Matchers extend OperatorMatcher(), which provides these methods.

When I change from > to >> my function isn’t called

Why, when I change my code from:

inverted = Drop('[^') & interval[1:] & Drop(']') > invert

to:

inverted = Drop('[^') & interval[1:] & Drop(']') >> invert

is the `invert` function no longer called?

This is because of operator precedence. In Python, > binds less tightly than &, but >> binds more tightly than &. So > is applied to the whole expression, while >> is applied only to the result from Drop(']'), which is an empty list (because Drop() discards the results). Since the list is empty, the function invert is not called.

To fix this, place the entire expression in parentheses:

inverted = (Drop('[^') & interval[1:] & Drop(']')) >> invert
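How Python groups these operators can be inspected without LEPL. The sketch below uses a hypothetical Recorder class that only records the grouping as text. It stands in for the matchers and is not part of the library.

```python
class Recorder:
    """Hypothetical stand-in for a matcher; records how operators are grouped."""

    def __init__(self, text):
        self.text = text

    def _combine(self, symbol, other):
        # Functions are shown by name, Recorder objects by their recorded text.
        right = other.text if isinstance(other, Recorder) else other.__name__
        return Recorder(f'({self.text} {symbol} {right})')

    def __and__(self, other):
        return self._combine('&', other)

    def __gt__(self, other):
        return self._combine('>', other)

    def __rshift__(self, other):
        return self._combine('>>', other)


def invert(results):
    # Placeholder for the user's function; it is never called here.
    return results


a, b, c = Recorder('a'), Recorder('b'), Recorder('c')

with_gt = (a & b & c > invert).text
with_rshift = (a & b & c >> invert).text
with_parens = ((a & b & c) >> invert).text

print(with_gt)
print(with_rshift)
print(with_parens)
```

The second line of output shows invert attached to c alone, which corresponds to Drop(']') in the question. The parenthesised form attaches it to the whole sequence.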