Type Specifications (pytyp.spec.abcs)

The classes defined in this module let you describe the type of data in detail. For example, Seq(Opt(int)) is a sequence of optional integers like [1,2,None,3]. What you do with that information is up to you — pytyp includes some utilities, but you can also build your own.

To help you use these descriptions, they support the following features (based on Python’s ABCs):

  • You can subclass them, and then use isinstance() and issubclass() as normal:

    >>> class MyIntegerList(Seq(int)): pass
    >>> issubclass(MyIntegerList, Seq(int))
    >>> issubclass(MyIntegerList, Seq(str))
  • You can register another class using the .register() method. The isinstance() and issubclass() functions will then work as though the registered class were a subclass (Rec() is for records: things like dicts).

    >>> class MyRecord: pass
    >>> Rec(a=int,b=str).register(MyRecord)
    >>> issubclass(MyRecord, Rec(a=int,b=str))
  • You can register a hashable object (not class!) using the .register_instance() method and then isinstance() will recognise the value.

    >>> class MyList(list):
    ...     def __init__(self, *args): super().__init__(args)
    >>> foo = MyList(1,2,3)
    >>> Seq(int).register_instance(foo)
    >>> isinstance(foo, Seq(int))
  • Objects that “act like” the appropriate description will work with isinstance() without any subclassing or registering (but because pytyp needs to check the contents this can be slow, especially for containers with many values).

    >>> isinstance([1,2,3], Seq(int))
  • Finally, you can use the type specification to iterate over the data you have. This lets you process values according to their types (it is how pytyp’s JSON conversion is implemented, for example). This is described in more detail below, in Iteration.
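
The subclassing and registration features in the list above follow the behaviour of Python’s own abc module. As a minimal sketch of that underlying mechanism, using only the standard library (IntSequence is a hypothetical stand-in for a specification such as Seq(int), not a pytyp class):

```python
from abc import ABC


class IntSequence(ABC):
    """Hypothetical stand-in for a specification such as Seq(int)."""


class MyIntegerList(IntSequence):
    """A direct subclass: recognised without any registration."""


class MyRecord:
    """An unrelated class, made a "virtual subclass" by register()."""


IntSequence.register(MyRecord)

subclass_ok = issubclass(MyIntegerList, IntSequence)
registered_ok = issubclass(MyRecord, IntSequence)
instance_ok = isinstance(MyRecord(), IntSequence)
unrelated_ok = issubclass(dict, IntSequence)
```

Registration only records a claim. Nothing here inspects what MyRecord actually contains, which is why registered classes are cheap to test.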


It is your responsibility to use type specifications correctly. Apart from the (inefficient) case of isinstance() with unregistered objects, no checking is done. The type specification is a label that you add to data to help simplify your program. It is not “static typing”.
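
To see why checking unregistered objects is the slow case, consider what a structural check has to do. The function below is a hypothetical illustration written with the standard library only; it is not how pytyp implements isinstance(), but it shows that every item must be visited:

```python
from collections.abc import Sequence


def looks_like_seq_of(value, item_type):
    """Return True if value is a non-string sequence whose items all match item_type."""
    if isinstance(value, (str, bytes)) or not isinstance(value, Sequence):
        return False
    # One isinstance() call per item: the cost grows with the container.
    return all(isinstance(item, item_type) for item in value)


all_ints = looks_like_seq_of([1, 2, 3], int)
mixed = looks_like_seq_of([1, 'two', 3], int)
# A tuple of types plays the role of an "optional integer" here.
optional_ints = looks_like_seq_of([1, 2, None, 3], (int, type(None)))
```

A registered class or instance avoids this walk entirely, at the price of trusting the label.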


For more detailed information on type specifications see the paper Algebraic ABCs.

Iteration


Type specifications contain two methods that can be used to iterate over a container. Both take a callback, which is a function that is called to “do something” with the contents of the container.

If you are used to functional programming, this idea is like a “fold”. It’s a little complicated at first, but once you understand, you’ll find it very powerful.
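
For readers less familiar with folds, here is a small sketch that does not use pytyp at all. The helper fold_nested() is a hypothetical name introduced for this illustration; it shows the general shape of passing callbacks that “do something” at each level of nested data:

```python
from functools import reduce


def fold_nested(value, combine, leaf):
    """Fold over nested lists and tuples.

    leaf() handles atomic values; combine() merges the results for the children.
    """
    if isinstance(value, (list, tuple)):
        return combine([fold_nested(item, combine, leaf) for item in value])
    return leaf(value)


# A flat fold from the standard library, for comparison.
total = reduce(lambda acc, x: acc + x, [1, 2, 3], 0)

# The same idea applied to nested data, with two different pairs of callbacks.
nested_total = fold_nested([1, [2, 3], (4,)], sum, lambda x: x)
rendered = fold_nested([1, [2, 3]], lambda parts: '[' + ';'.join(parts) + ']', str)
```

The traversal is written once; changing the callbacks changes the result from a sum to a string.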

To show how it can be used I will write a function that displays the contents of a container, together with the associated types. But iteration is much more general than this contrived example - it can be used for many kinds of processing that depend on both the data and the type.

First Attempt (Wrong)

The first attempt is shown below:

>>> def format1(v, s):
...     try:
...         return s._for_each(v, callback1)
...     except AttributeError:
...         return str(v)

>>> def callback1(current, vsn):
...     return '[{}:{}]'.format(current, ';'.join(format1(v, s) for (v, s, _) in vsn))

>>> def show1(value, spec):
...     return spec._for_each(value, callback1)

On each iteration, callback1() generates a set of brackets. The contents of these brackets are generated by format1(), which either starts a new iteration of nested data, or returns the lowest level “atomic” values. So callback1() and format1() are mutually recursive and, together, they “recursively explore” the data (in fact, it is a depth–first search).

The show1() function wraps everything up into a nice interface.

It’s easier to understand with some examples - note how the recursion automatically supports nested data:

>>> show1([1,2,3], Seq(int))
>>> show1([[1,2],('one','two')], Seq(Seq()))
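
The mutual recursion can also be tried without pytyp. The sketch below uses a toy stand-in, ToySeq, with a for_each() method that passes the callback the current specification and a series of (value, spec, name) triples. These names and this behaviour are assumptions made for illustration, modelled on the description above rather than on pytyp’s source:

```python
class ToySeq:
    """Hypothetical stand-in for a sequence specification (not pytyp's Seq)."""

    def __init__(self, item_spec):
        self.item_spec = item_spec

    def __repr__(self):
        # Plain types show their name; nested ToySeq instances show their own repr.
        return 'ToySeq({})'.format(getattr(self.item_spec, '__name__', self.item_spec))

    def for_each(self, value, callback):
        # One (value, spec, name) triple per item in the container.
        triples = ((item, self.item_spec, index) for index, item in enumerate(value))
        return callback(self, triples)


def toy_format(value, spec):
    # Recurse into nested specifications; plain types have no for_each().
    if hasattr(spec, 'for_each'):
        return spec.for_each(value, toy_callback)
    return str(value)


def toy_callback(current, vsn):
    return '[{}:{}]'.format(current, ';'.join(toy_format(v, s) for (v, s, _) in vsn))


flat = ToySeq(int).for_each([1, 2, 3], toy_callback)
nested = ToySeq(ToySeq(int)).for_each([[1, 2], [3]], toy_callback)
```

The output format of this toy is its own; it is not a record of what show1() prints.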

Second Attempt (Sum Types)

Unfortunately there is a problem with the code above:

>>> show1([1,2,None], Seq(Opt(int)))
Traceback (most recent call last):
TypeError: No alternative for 1

This is because _for_each() works differently for the Opt(), Alt() and Or() specifications: the callback receives each of the available combinations of value and type, and then an error is raised.

The reason for this behaviour is that these are “sum types”: they only know that the current value is one of a set of possibilities, and there is no general way to know beforehand which particular type is appropriate. While Seq() calls callback() with different values and specifications, the sum types use the same value with each possible type, and then raise an error.
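
One plausible way to picture this, sketched without pytyp, is a generator that offers the same value with each alternative type and fails once the alternatives run out. The generator alternatives_for() and both consumers are hypothetical names for this illustration, and the mechanism is an assumption based on the traceback above:

```python
def alternatives_for(value, alternatives):
    """Yield (value, spec, name) for each alternative, then fail."""
    for index, spec in enumerate(alternatives):
        yield value, spec, index
    raise TypeError('No alternative for {}'.format(value))


def consume_all(vsn):
    # Like callback1(): walks every triple, so it always reaches the error.
    return [spec for (_, spec, _) in vsn]


def first_match(vsn):
    # Returns at the first consistent pair, so the error is never reached.
    for value, spec, _ in vsn:
        if isinstance(value, spec):
            return spec


int_or_none = (int, type(None))
chosen_for_int = first_match(alternatives_for(1, int_or_none))
chosen_for_none = first_match(alternatives_for(None, int_or_none))

try:
    consume_all(alternatives_for(1, int_or_none))
    exhausted = False
except TypeError:
    exhausted = True
```

A callback that consumes everything fails even when a valid alternative exists; one that stops early does not.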

So for the sum types we should not iterate over all the data, but instead return the first time we have a value and type that are consistent:

>>> def format2(v, s):
...     try:
...         return s._for_each(v, callback2)
...     except AttributeError:
...         return str(v)

>>> def callback2(current, vsn):
...     if issubclass(current, Sum):
...         for (v, s, _) in vsn:
...             try:
...                 if isinstance(v, s):
...                     return '[{}:{}]'.format(current, format2(v, s))
...             except TypeError:
...                 pass
...     else:
...         return '[{}:{}]'.format(current, ';'.join(format2(v, s) for (v, s, _) in vsn))

>>> def show2(value, spec):
...     return spec._for_each(value, callback2)

>>> show2([1,2,None], Seq(Opt(int)))

This works, but the code is a mess. The callback is divided in two: the first half handles sum types, checking the type is OK before going ahead; the second half is as before, for everything else.

Faced with that, you might reasonably ask several questions, including:

  • What is except TypeError: doing? That handles the case where a nested sum type fails (perhaps at a higher level several types are consistent with the data, but when we get to a lower level some fail). This is still a depth–first search, but with sum types we need to backtrack on failure.
  • Why not test with isinstance() in _for_each() and save the user the work? Unfortunately, that is not always the correct behaviour — perhaps someone will be writing an application to fix type errors, not avoid them. And anyway, iteration is used by pytyp to implement isinstance(), so it must be “more fundamental”.
  • Surely there is a simpler way? Why, yes, there is...

Third Attempt (Backtracking)

Twice above I have mentioned that we are doing a depth–first search. And, for sum types, we must have backtracking. So we can move the logic associated with that into the library. Here’s the result (note that we are now calling _backtrack() instead of _for_each()):

>>> def format3(v, s):
...     if not isinstance(v, s): raise TypeError
...     try:
...         return s._backtrack(v, callback3)
...     except AttributeError:
...         return str(v)

>>> def callback3(current, vsn):
...     return '[{}:{}]'.format(current, ';'.join(format3(v, s) for (v, s, _) in vsn))

>>> def show3(value, spec):
...     return spec._backtrack(value, callback3)

Which is pretty sweet. So what is happening now?

With _backtrack(), only a single combination of value and type is passed to callback3() for sum types. That makes it possible to use the same code for all specifications.

But then what happened to the different possible types? They are handled by the library, which is expecting exceptions. When an exception occurs, it is caught and a new combination is passed in. So all that the client code has to do is throw an error when something is inconsistent: the library will backtrack and try a new type.
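
The division of labour can be sketched in a few lines of plain Python. Here toy_backtrack() is a hypothetical driver standing in for the library side, and describe() plays the role of the client code; neither is pytyp’s implementation:

```python
def toy_backtrack(value, alternatives, callback):
    """Try each alternative in turn until the callback succeeds."""
    for spec in alternatives:
        try:
            # The callback sees exactly one (value, spec) combination.
            return callback(value, spec)
        except Exception:
            continue  # inconsistent: backtrack and try the next alternative
    raise TypeError('No alternative for {!r}'.format(value))


def describe(value, spec):
    # Client code only has to signal inconsistency by raising.
    if not isinstance(value, spec):
        raise TypeError
    return '{}:{}'.format(spec.__name__, value)


int_or_none = (int, type(None))
for_int = toy_backtrack(1, int_or_none, describe)
for_none = toy_backtrack(None, int_or_none, describe)

try:
    toy_backtrack('text', int_or_none, describe)
    failed = False
except TypeError:
    failed = True
```

Only when every alternative has been rejected does the error reach the caller.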

Of course, some exceptions should not be treated this way. So if you want an exception to be excluded from backtracking (i.e. to be raised as normal, instead of being used to trigger backtracking), register it with NoBacktrack.
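
As a rough sketch of how such a registry could work, the example below uses a standard library ABC as a marker for exceptions that must propagate. ToyNoBacktrack, FatalError and try_alternatives() are hypothetical names for this illustration, and the use of ABC registration is an assumption, not a description of pytyp’s NoBacktrack:

```python
from abc import ABC


class ToyNoBacktrack(ABC):
    """Hypothetical marker: exceptions registered here are never swallowed."""


class FatalError(Exception):
    """Example exception that should always reach the caller."""


ToyNoBacktrack.register(FatalError)


def try_alternatives(value, alternatives, callback):
    for spec in alternatives:
        try:
            return callback(value, spec)
        except Exception as error:
            if isinstance(error, ToyNoBacktrack):
                raise  # registered: propagate instead of backtracking
    raise TypeError('No alternative for {!r}'.format(value))


def strict(value, spec):
    if not isinstance(value, spec):
        raise TypeError  # ordinary failure: triggers backtracking
    return spec.__name__


def broken(value, spec):
    raise FatalError('stop')


int_or_none = (int, type(None))
matched = try_alternatives(None, int_or_none, strict)

try:
    try_alternatives(None, int_or_none, broken)
    propagated = False
except FatalError:
    propagated = True
```

Without the registration, the FatalError would be treated as one more failed alternative and hidden behind a generic TypeError.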