Package lepl :: Package lexer :: Module lexer :: Class Lexer

Class Lexer



This takes a set of regular expressions and provides a matcher that converts a stream into a stream of tokens, passing the new stream to the embedded matcher.

It is added to the matcher graph by the lexer_rewriter; it is not specified explicitly by the user.
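The tokenising step can be pictured with a self-contained sketch (plain Python `re`, not LEPL's actual implementation): each token regexp is tried at the current position, the longest match wins, and text matching only the discard expression is silently dropped.

```python
import re

def tokenise(text, token_regexps, discard=r'\s+'):
    """Illustrative sketch of lexing: the longest match wins; text
    matching only the discard pattern is silently dropped."""
    pos = 0
    while pos < len(text):
        best = None  # (length, token_id, matched_text)
        for token_id, pattern in token_regexps.items():
            m = re.match(pattern, text[pos:])
            if m and (best is None or len(m.group()) > best[0]):
                best = (len(m.group()), token_id, m.group())
        if best:
            yield (best[1], best[2])
            pos += best[0]
        else:
            m = re.match(discard, text[pos:])
            if m:
                pos += len(m.group())  # drop discarded spaces
            else:
                raise ValueError('No token matches at %r' % text[pos:])

tokens = {'NUM': r'[0-9]+', 'ID': r'[a-z]+'}
print(list(tokenise('abc 123', tokens)))
# [('ID', 'abc'), ('NUM', '123')]
```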

Instance Methods
 
__init__(self, matcher, tokens, alphabet, discard, t_regexp=None, s_regexp=None)
matcher is the head of the original matcher graph, which will be called with a tokenised stream.
 
token_for_id(self, id_)
A utility that checks the known tokens for a given ID.
 
_tokens(self, stream, max)
Generate tokens, on demand.
 
_match(self, in_stream)
Implement matching - pass the token stream to the embedded matcher.

Inherited from support.context.NamespaceMixin (private): _lookup

Inherited from matchers.support.BaseMatcher: __repr__, __str__, clone, kargs, tree, tree_repr

Inherited from support.graph.ArgAsAttributeMixin: __iter__

Inherited from support.graph.PostorderWalkerMixin: postorder

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Inherited from matchers.matcher.Matcher: indented_repr

Properties

Inherited from object: __class__

Method Details

__init__(self, matcher, tokens, alphabet, discard, t_regexp=None, s_regexp=None)
(Constructor)


matcher is the head of the original matcher graph, which will be called with a tokenised stream.

tokens is the set of Token instances that define the lexer.

alphabet is the alphabet for which the regexps are defined.

discard is the regular expression for spaces (which are silently dropped if no token can be matched).

t_regexp and s_regexp are internally compiled state, used in cloning, and should not be provided by non-cloning callers.

Overrides: matchers.matcher.Matcher.__init__

token_for_id(self, id_)

A utility that checks the known tokens for a given ID. The ID is used internally, but is (by default) an unfriendly integer value. Note that a lexed stream associates a chunk of input with a list of IDs - more than one regexp may be a maximal match (and this is a feature, not a bug).
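A small sketch (plain Python, not LEPL's internals) of how one chunk of input can carry several IDs when more than one regexp gives a maximal match:

```python
import re

def maximal_token_ids(text, token_regexps):
    """Return (matched_text, [ids]) for the longest match at the start
    of text; every regexp achieving that length contributes its ID."""
    matches = {}
    for token_id, pattern in token_regexps.items():
        m = re.match(pattern, text)
        if m:
            matches.setdefault(len(m.group()), []).append(token_id)
    if not matches:
        return None
    longest = max(matches)
    return (text[:longest], sorted(matches[longest]))

# 'if' is both a keyword and a valid identifier - both IDs are kept
regexps = {'KEYWORD': r'if|else', 'IDENT': r'[a-z]+'}
print(maximal_token_ids('if x', regexps))
# ('if', ['IDENT', 'KEYWORD'])
```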

_match(self, in_stream)

Implement matching - pass the token stream to the embedded matcher.
Decorators:
  • @tagged
Overrides: matchers.matcher.Matcher._match
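The control flow of _match can be pictured with a toy sketch (plain Python, not LEPL's code; the matcher and tokeniser below are hypothetical stand-ins): the lexer builds a token stream from the incoming character stream and hands it to the wrapped matcher.

```python
def make_lexer_match(matcher, tokenise):
    """Toy version of Lexer._match: tokenise the incoming character
    stream, then delegate matching to the wrapped matcher."""
    def _match(in_stream):
        token_stream = list(tokenise(in_stream))  # tokenised view of input
        return matcher(token_stream)              # matcher sees tokens only
    return _match

# hypothetical matcher that succeeds if the first token is a NUM
first_is_num = lambda tokens: bool(tokens) and tokens[0][0] == 'NUM'
match = make_lexer_match(first_is_num,
                         lambda s: [('NUM', w) if w.isdigit() else ('ID', w)
                                    for w in s.split()])
print(match('123 abc'))
# True
```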