Type Specifications (pytyp.spec.abcs)

The classes defined in this module let you describe the type of data in detail. For example, Seq(Opt(int)) is a sequence of optional integers like [1,2,None,3]. What you do with that information is up to you — pytyp includes some utilities, but you can also build your own.

To help you use these descriptions they support the following features (based on Python’s ABCs):

  • You can subclass them, and then use isinstance() and issubclass() as normal:

    >>> class MyIntegerList(Seq(int)): pass
    >>> issubclass(MyIntegerList, Seq(int))
    True
    >>> issubclass(MyIntegerList, Seq(str))
    False
    
  • You can register another class using the .register() method. The isinstance() and issubclass() methods will then work as though the registered class was a subclass (Rec() is for record — things like dicts).

    >>> class MyRecord: pass
    >>> Rec(a=int,b=str).register(MyRecord)
    >>> issubclass(MyRecord, Rec(a=int,b=str))
    True
    
  • You can register a hashable object (not class!) using the .register_instance() method and then isinstance() will recognise the value.

    >>> class MyList(list):
    ...     def __init__(self, *args): super().__init__(args)
    >>> foo = MyList(1,2,3)
    >>> Seq(int).register_instance(foo)
    >>> isinstance(foo, Seq(int))
    True
    
  • Objects that “act like” the appropriate description will work with isinstance() without any subclassing or registering (but because pytyp needs to check the contents this can be slow, especially for containers with many values).

    >>> isinstance([1,2,3], Seq(int))
    True
    
  • Finally, you can use the type specification to iterate over the data you have. This lets you process values according to their types (it is how pytyp’s JSON conversion is implemented, for example). This is described in more detail below, in Iteration.
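The subclassing and registration behaviour listed above mirrors the standard library's abc module. The sketch below uses only the standard library; Record and MyRecord are hypothetical names, and Record stands in for a type specification rather than reproducing pytyp's Rec().

```python
from abc import ABC


class Record(ABC):
    """Hypothetical stand-in for a type specification."""


class MyRecord:
    pass


# Before registration the two classes are unrelated.
before = issubclass(MyRecord, Record)

# register() declares MyRecord a "virtual subclass"; nothing about
# MyRecord's contents is checked.
Record.register(MyRecord)
after = issubclass(MyRecord, Record)
is_instance = isinstance(MyRecord(), Record)
```

Here before is False, while after and is_instance are both True.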

Warning

It is your responsibility to use type specifications correctly. Apart from the (inefficient) case of isinstance() with unregistered objects, no checking is made. The type specification is a label that you add to data to help simplify your program. It is not “static typing”.
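To see why the unregistered case is slow, consider what a structural check has to do. The function below is a hypothetical sketch (is_seq_of is not part of pytyp, and pytyp's own check may differ), but any such check must visit every element:

```python
def is_seq_of(value, element_type):
    """Hypothetical structural check: True if value is a list or tuple
    whose elements are all instances of element_type."""
    if not isinstance(value, (list, tuple)):
        return False
    # Every element is inspected, so the cost grows with len(value).
    return all(isinstance(item, element_type) for item in value)


ok = is_seq_of([1, 2, 3], int)
bad = is_seq_of([1, 'two', None], int)
```

Registering a class or instance avoids this walk, which is why no checking is done in that case.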

Tip

For more detailed information on type specifications see the paper Algebraic ABCs.

Constructors

The different constructors for type specifications are listed below.

Sequences

class pytyp.spec.abcs.Seq

This describes a sequence of values, all of the same type. For example:

>>> isinstance([1,2,3], Seq(int))
True
>>> isinstance(('four', 'five'), Seq(str))
True
>>> isinstance([1,'two',None], Seq(int))
False

If no type is given, then object is assumed (which is the same as “anything”):

>>> isinstance([1,'two',None], Seq())
True

Records

class pytyp.spec.abcs.Rec

This describes records - containers with contents that are accessed via a name. Usually the name is a string:

>>> isinstance({'a':1, 'b':'two'}, Rec(a=int, b=str))
True

But it can also be an integer (unnamed arguments to Rec() are numbered from 0):

>>> isinstance((1, 'two'), Rec(int, str))
True

Or even arbitrary objects:

>>> foo = object()
>>> isinstance({foo: 1}, Rec(_dict={foo: int}))
True
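What the three forms have in common is that each name indexes the container. A hypothetical sketch of that idea (matches_record is an illustrative name, not pytyp's API; it ignores any extra entries in the value, which may not match pytyp's behaviour):

```python
def matches_record(value, fields):
    """Hypothetical check: every key in fields indexes value and
    yields an instance of the expected type."""
    try:
        return all(isinstance(value[key], expected)
                   for key, expected in fields.items())
    except (KeyError, IndexError, TypeError):
        # Missing name, index out of range, or value is not indexable.
        return False


by_name = matches_record({'a': 1, 'b': 'two'}, {'a': int, 'b': str})
by_index = matches_record((1, 'two'), {0: int, 1: str})
too_short = matches_record((1,), {0: int, 1: str})
```

Here by_name and by_index are True and too_short is False.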

Attributes

class pytyp.spec.abcs.Atr

This describes the attributes on an object. Methods are not supported (instead, use function annotations on the method itself):

>>> class Foo:
...     def __init__(self, a, b):
...         self.a = a
...         self.b = b
>>> foo = Foo(1, 'two')
>>> Atr(a=int, b=str).register_instance(foo)

The Cls() constructor (described below) also has a “shorthand” for defining classes with attributes:

>>> class Bar: pass    
>>> Cls(Bar, a=int, b=str)
And(Cls(Bar),Atr(a=int,b=str))

Alternatives and Optional

class pytyp.spec.abcs.Alt

This describes a value that can have more than one type. For example, Alt(int,str) can be an int or a str:

>>> isinstance(1, Alt(number=int, text=str))
True
>>> isinstance('two', Alt(number=int, text=str))
True
>>> isinstance(3.0, Alt(number=int, text=str))
False

This is like Or() below, but lets you add a name to the different alternatives (this name is available during iteration - see below - and what it means will depend on how the type specification is being used).
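As a rough picture of what the name buys you, the hypothetical helper below returns the name of the first alternative that fits a value. It uses only isinstance() on plain classes and is not how pytyp exposes the name (that happens through iteration).

```python
def which_alternative(value, **alternatives):
    """Hypothetical: return the name of the first alternative whose
    type matches value (keyword order is preserved)."""
    for name, expected in alternatives.items():
        if isinstance(value, expected):
            return name
    raise TypeError('No alternative for {!r}'.format(value))


number_name = which_alternative(1, number=int, text=str)
text_name = which_alternative('two', number=int, text=str)
```

Here number_name is 'number' and text_name is 'text'; a float such as 3.0 raises TypeError.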

class pytyp.spec.abcs.Opt

This describes a common case of Alt() where the value is either the given type, or None.

>>> isinstance(1, Opt(int))
True
>>> isinstance(None, Opt(int))
True
>>> issubclass(Opt(int), Alt(value=int,none=type(None)))
True

And and Or

class pytyp.spec.abcs.And

This describes something with several different types at the same time:

>>> isinstance([1,2,3], And(list, Seq(int)))
True
>>> isinstance((1,2,3), And(list, Seq(int)))
False
>>> isinstance((1,2,3), Seq(int))
True

class pytyp.spec.abcs.Or

This describes something that is one of several different types (and we don’t know which). It is very like Alt() above, except that the alternatives cannot be named.

>>> isinstance(1, Or(int, str))
True
>>> isinstance('two', Or(int, str))
True
>>> isinstance(3.0, Or(int, str))
False
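For plain classes, the difference between the two comes down to "all" versus "any". A hypothetical sketch using only the standard library (is_and and is_or are illustrative names, and Sequence stands in for a specification such as Seq()):

```python
from collections.abc import Sequence


def is_and(value, *types):
    """Hypothetical: value must be an instance of every type."""
    return all(isinstance(value, t) for t in types)


def is_or(value, *types):
    """Hypothetical: value must be an instance of at least one type."""
    return any(isinstance(value, t) for t in types)


list_and_sequence = is_and([1, 2, 3], list, Sequence)
tuple_and_sequence = is_and((1, 2, 3), list, Sequence)
float_or = is_or(3.0, int, str)
```

Here list_and_sequence is True, tuple_and_sequence is False (a tuple is a Sequence but not a list), and float_or is False.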

Classes

class pytyp.spec.abcs.Cls

This describes a particular class. You don’t need to use it normally (just use the class itself), but it is used internally:

>>> Seq(int) is Seq(Cls(int))
True

pytyp.spec.abcs.ANY

A useful pre-defined type specification that matches any object. It is the same as Cls(object) (which can also be written as Cls()).

alias of __Cls

class pytyp.spec.abcs.Sub

This is like Cls(), but uses issubclass() rather than isinstance().

It doesn’t make much sense as a type specification, and is arguably an ugly hack, but it is very useful when using dispatch by type (see pytyp.spec.dispatch).

Normalisation

pytyp.spec.abcs.normalize

Type specifications are built using constructors like Seq() and Rec(), but it is also possible to use a “shorthand” form, in which () and {} are used for records, and [] for sequences.

This routine rewrites the shorthand into the standard format.

>>> normalize({'a': int})
Rec(a=int)
>>> normalize((int, str))
Rec(int,str)
>>> normalize([int])
Seq(int)
>>> normalize([])
Seq(Cls(object))
>>> normalize(Opt([int]))
Opt(Seq(int))
>>> normalize([int, str])
Rec(int,str)

alias of _normalize
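The rewriting rules implied by the examples above can be summarised in a few lines. This is a hypothetical sketch that renders the result as a string; normalize_sketch is an illustrative name, it handles only plain classes and the three container shorthands, and it is not pytyp's _normalize.

```python
def normalize_sketch(spec):
    """Hypothetical: render shorthand in constructor form, as a string."""
    def positional(items):
        return 'Rec({})'.format(','.join(normalize_sketch(i) for i in items))

    if isinstance(spec, dict):
        # {} with names becomes a record with named fields.
        return 'Rec({})'.format(','.join(
            '{}={}'.format(k, normalize_sketch(v)) for k, v in spec.items()))
    if isinstance(spec, tuple):
        # () becomes a record with numbered fields.
        return positional(spec)
    if isinstance(spec, list):
        if not spec:
            return 'Seq(Cls(object))'
        if len(spec) == 1:
            return 'Seq({})'.format(normalize_sketch(spec[0]))
        # A list with several entries is treated as a record.
        return positional(spec)
    return spec.__name__


results = [normalize_sketch(s)
           for s in ({'a': int}, (int, str), [int], [], [int, str])]
```

The results list matches the outputs shown above: 'Rec(a=int)', 'Rec(int,str)', 'Seq(int)', 'Seq(Cls(object))' and 'Rec(int,str)'.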

Iteration

Type specifications contain two methods that can be used to iterate over a container. Both take a callback, which is a function that is called to “do something” with the contents of the container.

If you are used to functional programming, this idea is like a “fold”. It’s a little complicated at first, but once you understand, you’ll find it very powerful.

To show how it can be used I will write a function that displays the contents of a container, together with the associated types. But iteration is much more general than this contrived example - it can be used for many kinds of processing that depend on both the data and the type.

First Attempt (Wrong)

The first attempt is shown below:

>>> def format1(v, s):
...     try:
...         return s._for_each(v, callback1)
...     except AttributeError:
...         return str(v)

>>> def callback1(current, vsn):
...     return '[{}:{}]'.format(current, ';'.join(format1(v, s) for (v, s, _) in vsn))

>>> def show1(value, spec):
...     return spec._for_each(value, callback1)

On each iteration, callback1() generates a set of brackets. The contents of these brackets are generated by format1(), which either starts a new iteration of nested data, or returns the lowest level “atomic” values. So callback1() and format1() are mutually recursive and, together, they “recursively explore” the data (in fact, it is a depth–first search).

The show1() function wraps everything up into a nice interface.

It’s easier to understand with some examples - note how the recursion automatically supports nested data:

>>> show1([1,2,3], Seq(int))
[Seq(int):[int:1];[int:2];[int:3]]
>>> show1([[1,2],('one','two')], Seq(Seq()))
[Seq(Seq(Cls(object))):[Seq(Cls(object)):[Cls(object):1];[Cls(object):2]];[Seq(Cls(object)):[Cls(object):one];[Cls(object):two]]]
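If you want to experiment with the shape of this recursion without pytyp installed, a toy stand-in is enough. ToySeq below is hypothetical and models only the calling convention used above (the callback receives the current specification and an iterable of (value, spec, name) triples). Unlike pytyp, it does not wrap plain classes in Cls(), so atomic values appear without their own brackets.

```python
class ToySeq:
    """Hypothetical stand-in for Seq(): a sequence of one element spec."""

    def __init__(self, inner):
        self.inner = inner

    def __repr__(self):
        inner = self.inner
        name = inner.__name__ if isinstance(inner, type) else repr(inner)
        return 'ToySeq({})'.format(name)

    def _for_each(self, value, callback):
        # One (value, spec, name) triple per element; the index is the name.
        triples = ((v, self.inner, i) for i, v in enumerate(value))
        return callback(self, triples)


def toy_format(v, s):
    try:
        return s._for_each(v, toy_callback)
    except AttributeError:
        # Plain classes such as int have no _for_each: treat as atomic.
        return str(v)


def toy_callback(current, vsn):
    contents = ';'.join(toy_format(v, s) for (v, s, _) in vsn)
    return '[{}:{}]'.format(current, contents)


def toy_show(value, spec):
    return spec._for_each(value, toy_callback)


flat = toy_show([1, 2, 3], ToySeq(int))
nested = toy_show([[1, 2], [3]], ToySeq(ToySeq(int)))
```

Here flat is '[ToySeq(int):1;2;3]' and nested is '[ToySeq(ToySeq(int)):[ToySeq(int):1;2];[ToySeq(int):3]]'.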

Second Attempt (Sum Types)

Unfortunately there is a problem with the code above:

>>> show1([1,2,None], Seq(Opt(int)))
Traceback (most recent call last):
  ...
TypeError: No alternative for 1

This is because _for_each() for the Opt(), Alt() and Or() specifications works differently: the callback receives each combination of value and type that is available, and then an error is raised.

The reason for this behaviour is that these are “sum types” - they only know that the current value is one of a set of possibilities - and there is no general way to know beforehand which particular type is appropriate. While Seq() calls callback() with different values and specifications, the sum types use the same value with each possible type, and then raise an error.
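One arrangement consistent with the traceback above is a generator that yields every candidate and raises only once it runs out. The sketch below is an assumption about the mechanism, not a description of pytyp's source; sum_triples and the names in candidates are hypothetical.

```python
def sum_triples(value, alternatives):
    """Hypothetical model of what a sum type hands to the callback."""
    for name, spec in alternatives.items():
        # Same value every time; only the candidate type changes.
        yield (value, spec, name)
    # Reached only if the consumer asks for more after the last candidate.
    raise TypeError('No alternative for {!r}'.format(value))


candidates = {'value': int, 'none': type(None)}

# Stopping at the first triple never reaches the raise.
first = next(sum_triples(1, candidates))

# Exhausting the triples, as ';'.join(...) over them would, does.
try:
    list(sum_triples(1, candidates))
    exhausted_ok = True
except TypeError:
    exhausted_ok = False
```

Here first is (1, int, 'value') and exhausted_ok is False.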

So for the sum types we should not iterate over all the data, but instead return the first time we have a value and type that are consistent:

>>> def format2(v, s):
...     try:
...         return s._for_each(v, callback2)
...     except AttributeError:
...         return str(v)

>>> def callback2(current, vsn):
...     if issubclass(current, Sum):
...         for (v, s, _) in vsn:
...             try:
...                 if isinstance(v, s):
...                     return '[{}:{}]'.format(current, format2(v, s))
...             except TypeError:
...                 pass
...     else:
...         return '[{}:{}]'.format(current, ';'.join(format2(v, s) for (v, s, _) in vsn))

>>> def show2(value, spec):
...     return spec._for_each(value, callback2)

>>> show2([1,2,None], Seq(Opt(int)))
[Seq(Opt(int)):[Opt(int):[int:1]];[Opt(int):[int:2]];[Opt(int):[Cls(NoneType):None]]]

This works, but the code is a mess. The callback is divided in two: the first half handles sum types, checking the type is OK before going ahead; the second half is as before, for everything else.

Faced with that, you might reasonably ask several questions, including:

  • What is except TypeError: doing? That handles the case where a nested sum type fails (perhaps at a higher level several types are consistent with the data, but when we get to a lower level some fail). This is still a depth–first search, but with sum types we need to backtrack on failure.
  • Why not test with isinstance() in _for_each() and save the user the work? Unfortunately, that is not always the correct behaviour — perhaps someone will be writing an application to fix type errors, not avoid them. And anyway, iteration is used by pytyp to implement isinstance(), so it must be “more fundamental”.
  • Surely there is a simpler way? Why, yes, there is...

Third Attempt (Backtracking)

Twice above I have mentioned that we are doing a depth–first search. And, for sum types, we must have backtracking. So we can move the logic associated with that into the library. Here’s the result (note that we are now calling _backtrack() instead of _for_each()):

>>> def format3(v, s):
...     if not isinstance(v, s): raise TypeError
...     try:
...         return s._backtrack(v, callback3)
...     except AttributeError:
...         return str(v)

>>> def callback3(current, vsn):
...     return '[{}:{}]'.format(current, ';'.join(format3(v, s) for (v, s, _) in vsn))

>>> def show3(value, spec):
...     return spec._backtrack(value, callback3)

Which is pretty sweet. So what is happening now?

With _backtrack(), only a single combination of value and type is passed to callback3() for sum types. That makes it possible to use the same code for all specifications.

But then what happened to the different possible types? They are handled by the library, which is expecting exceptions. When an exception occurs, it is caught and a new combination is passed in. So all that the client code has to do is throw an error when something is inconsistent: the library will backtrack and try a new type.

Of course, some exceptions should not be treated this way. So if you want an exception to be ignored by the backtracking (i.e. to be raised as normal, instead of being used to trigger backtracking), then register it with NoBacktrack.

class pytyp.spec.abcs.NoBacktrack

If this exception (or a subclass, or a registered class) occurs within _backtrack() then it will be allowed to “escape” to the surrounding code (other exceptions will be caught and used to trigger backtracking over sum types).
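The control flow described in this section can be modelled in a few lines. Everything below is a hypothetical sketch (backtrack_sketch, NoBacktrackSketch and describe are illustrative names, and alternatives are plain classes); it shows the idea of "catch, then try the next type", not pytyp's implementation.

```python
class NoBacktrackSketch(Exception):
    """Hypothetical: errors of this type escape instead of
    triggering backtracking."""


def backtrack_sketch(value, alternatives, callback):
    """Try each alternative in turn and return the first callback
    result that does not raise."""
    for spec in alternatives:
        try:
            return callback(value, spec)
        except NoBacktrackSketch:
            raise  # allowed to escape to the surrounding code
        except Exception:
            continue  # inconsistent: backtrack and try the next type
    raise TypeError('No alternative for {!r}'.format(value))


def describe(value, spec):
    # Client code only has to raise when something is inconsistent.
    if not isinstance(value, spec):
        raise TypeError
    return '[{}:{}]'.format(spec.__name__, value)


optional_int = (int, type(None))
as_int = backtrack_sketch(1, optional_int, describe)
as_none = backtrack_sketch(None, optional_int, describe)
```

Here as_int is '[int:1]' and as_none is '[NoneType:None]'; the second call succeeds only because the TypeError raised for int was caught and the next alternative tried.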