| Andrew Cooke | Contents | Latest | RSS | Twitter | Previous | Next


Welcome to my blog, which was once a mailing list of the same name and is still generated by mail. Please reply via the "comment" links.

Always interested in offers/projects/new ideas. Eclectic experience in fields like: numerical computing; Python web; Java enterprise; functional languages; GPGPU; SQL databases; etc. Based in Santiago, Chile; telecommute worldwide. CV; email.

Personal Projects

Lepl parser for Python.

Colorless Green.

Photography around Santiago.

SVG experiment.

Professional Portfolio

Calibration of seismometers.

Data access via web services.

Cache rewrite.

Extending OpenSSH.

Last 100 entries

Chat Wars; Real-life Financial Co Without ACID Database...; Flexible Muscle-Based Locomotion for Bipedal Creatures; SQL Performance Explained; The Little Manual of API Design; Multiple Word Sizes; CRC - Next Steps; FizzBuzz; Update on CRCs; Decent Links / Discussion Community; Automated Reasoning About LLVM Optimizations and Undefined Behavior; A Painless Guide To CRC Error Detection Algorithms; Tests in Julia; Dave Eggers: what's so funny about peace, love and Starship?; Cello - High Level C Programming; autoreconf needs tar; Will Self Goes To Heathrow; Top 5 BioInformatics Papers; Vasovagal Response; Good Food in Vina; Chilean Drug Criminals Use Subsitution Cipher; Adrenaline; Stiglitz on the Impact of Technology; Why Not; How I Am 5; Lenovo X240 OpenSuse 13.1; NSA and GCHQ - Psychological Trolls; Finite Fields in Julia (Defining Your Own Number Type); Julian Assange; Starting Qemu on OpenSuse; Noisy GAs/TMs; Venezuela; Reinstalling GRUB with EFI; Instructions For Disabling KDE Indexing; Evolving Speakers; Changing Salt Size in Simple Crypt 3.0.0; Logarithmic Map (Moved); More Info; Words Found in Voynich Manuscript; An Inventory Of 3D Space-Filling Curves; Foxes Using Magnetic Fields To Hunt; 5 Rounds RC5 No Rotation; JP Morgan and Madoff; Ori - Secure, Distributed File System; Physical Unclonable Functions (PUFs); Prejudice on Reddit; Recursion OK; Optimizing Julia Code; Cash Handouts in Brazil; Couple Nice Music Videos; It Also Works!; Adaptive Plaintext; It Works!; RC5 Without Rotation (2); 8 Years...; Attack Against Encrypted Linux Disks; Pushing Back On NSA At IETF; Summary of Experimental Ethics; Very Good Talk On Security, Snowden; Locusts are Grasshoppers!; Vagrant (OpenSuse and IDEs); Interesting Take On Mandela's Context; Haskell Cabal O(n^2) / O(n) Fix; How I Am 4; Chilean Charity Supporting Women; Doing SSH right; Festival of Urban Intervention; Neat Idea - Wormholes Provide Entanglement; And a Link....; Simple Encryption for Python 2.7; OpenSuse 13.1 Is Better!; Little Gain...; More Details on Technofull Damage; Palmrest Cracked Too....; Tecnofull (Lenovo Support) Is Fucking Useless; The Neuroscientist Who Discovered He Was a Psychopath; Interpolating Polynomials; Bottlehead Crack as Pre-amp; Ooops K702!; Bottlehead Crack, AKG K701; Breaking RC5 Without Rotation; Great post thank you; Big Balls of Mud; Phabricator - Tools for working together; Amazing Julia RC5 Code Parameterized By Word Size; Chi-Square Can Be Two-Sided; Why Do Brits Accept Surveillance?; Statistics Done Wrong; Mesas Trape from Bravo; European Report on Crypto Primitives and Protocols; Interesting Omissions; Oryx And Crake (Margaret Atwood); Music and Theory; My Arduino Programs; Elliptic Curve Crypto; Re: Licensing Interpreted Code; Licensing Interpreted Code; ASUS 1015E-DS03 OpenSuse 12.3 SSD; translating lettuce feature files into stub steps files; Re: translating lettuce feature files into stub steps files; A Tale of Two Psychiatrists

© 2006-2013 Andrew Cooke (site) / post authors (content).

Lessons Learned with Erlang

From: "andrew cooke" <andrew@...>

Date: Mon, 4 Jun 2007 06:35:15 -0400 (CLT)

I have implemented various versions of a "Parallel Sudoku Solver" in
Erlang.  These notes try to capture some of the lessons I have

They are written with an eye on Jonathan Sillito's recent work adding
similar (message sending) functionality to Ruby - I think there are
some interesting issues about "transplanting" ideas from one language
to another (one that includes mutable state but excludes efficient
tail calls) and, at a more practical level, implementing state
machines in an OO language (something I have worried about,
intermittently, for years).

Implementation Details

A quick reading of the latest code at
http://www.acooke.org/cute/ReducedRan0.html shows that it is an
almost-declarative description of a state machine.  Each state
(starting, searching, asserting, swapping, sleeping) is embodied as a

Messages are received in a particular state.  The logic in that state
(compactly described with pattern matching) selects a transition.
Transitions are helper functions that may send a response message
before calling a new state.

Tail calls are used throughout.  Since Erlang "optimises" such calls
no stack is consumed in the transition and the implementation does not
have to be concerned with "returning" (an alien concept for state

Some messages are common to more than one transition.  These are
handled by a single function which takes the following state as an
argument (first class functions simulating continuation passing).

Traditional Functional Strengths

The code illustrates how Erlang plays to functional programming's
traditional strengths:

- Pattern matching (extended in Erlang with conditions using a
  restricted range of predicates) allows program logic to be expressed
  in a clear, concise manner.

- Explicit state and tail call optimisation simplify transitions
  between states.

- First class functions allow easy generalisation of transitions.

Together these make it easy to directly embed a state machine in the
source code.

Erlang also provides, in the OTP platform, "behaviour" support for
state machines.  I have not yet investigated this.

No Magic Bullet

Erlang's message passing approach does not "magically" fix the
problems inherent in concurrent software.  During development I
stumbled across deadlocks and inconsistent state more than once.  What
it does, however, is allow the code to be structured so that the
assumptions, state and logic are easy to read.  This helps ad-hoc
reasoning while debugging and allows issues to be resolved relatively
quickly (this is a relatively subjective view and I have no more
objective evidence).

Lessons Learned

The following approaches (patterns?) were helpful:

- Match for unexpected messages (and fail with a message when you
  receive them).  This is analogous to the default case in case

- Use a fixed format for mutable state (ie the arguments to the
  functions that represent states in the FSM).  This allows general
  transitions which take the "next" state/function as an argument.

- Use tagged tuples for messages (ie first tuple value is an atom that
  "names" the message).  This is a recognised custom in Erlang and
  gives ad-hoc typing.

- Use {name, {payload...}} pairs for messages (ie the payload is a
  separate tuple).  This allows matching on the name (eg {type,
  _Payload}) without hard-coding the payload structure in each

- Fail deeply.  When failure occurs, fall back to as basic a state as
  possible.  Avoid states that loop on failure, immediately repeating
  the action, since this encourages deadlock (an example directly from
  the code: in the swapping state, failure to swap was initially
  handled by a second attempt at swapping.  This eventually lead to
  deadlock with all cells requesting swaps from other cells.)

- Avoid invariants.  This is a hard one, because invariants are so
  useful, but maintaining an invariant across several threads requires
  detailed negotiation that cannot always "fail deeply" (in the
  "community invariant" described below a cell that attempts to swap
  its value with another cell cannot process further requests that
  modify its state - assertions or swaps - until it has a reply from
  the swap partner).

- Use explicit, restricted protocols.  "Fail deeply" and "avoid
  invariants" drive the system towards a "permissive" protocol.  This
  should be resisted as much as possible, since it becomes difficult
  to detect errors (ie "Match for unexpected messages" becomes less

Ruby Library

My main concern for the Ruby library is that a lack of efficient tail
calls means that complex state machines cannot be easily separated
into distinct functions (one per state).

There seem to be two solutions: either use trampolining or represent
state as a variable rather than a function.  It would be interesting
to explore these two avenues - it may be that they suggest an
alternative syntax better suited to Ruby.

States as Objects

The natural approach to state machines in an OO language without
support for unlimited tail calls seems to be:

- Each state is an object

- Each state implements a standard interface ("run" or similar) that
  returns the *next* state

- Have a outer object (the state "machine") that manages the process
  of calling the standard interface on the successive states.

- As I noted 8 years ago - http://www.acooke.org/andrew/smachine.html
  - machines can themselves be states.

Reflecting a little on the comments above it becomes clear that this
combines both approaches described in the previous section - the state
is a variable whose value is the current state and the outer machine
object is the "trampoline".

So I guess I have answered my own question - it should be possible to
adapt the code to use an approach like this.  The remaining issues (eg
does each state create the "next" state on every call, or is there a
pool?) seem to be mainly book-keeping.

Translating to Ruby

If the above is correct then most comments here also apply to code in
Ruby.  The main changes are:

- Tail calls are replaced by the trampolining approach described

- "Generic" transitions would be methods in a shared superclass rather
  than functions that take a "next state" function.

Algorithm Discussion

The algorithm used is, obviously, a poor choice for solving Sudoku
puzzles.  Peter Norvig has already given a textbook-perfect solution
at http://norvig.com/sudoku.html

However, the parallel approach it is still useful as an example for
learning Erlang/message passing:

- It addresses a fashionable issue, which is motivating (received
  attention from Reddit, for example).

- The initial approach is manageable for a newbie.

- Progressively more complex solutions to be introduced (eg
  maintaining the "community invariant" so that one of each digit is
  used in each "block of 9").

- Finally, it is also possible that in some situations (although, in
  my experience, less often than one believes when faced with an
  apparently difficult problem) "guided random" exploration of the
  solution space is the only tool left.

Motivated by the last point, and because I find the issues
interesting, I will use the remaining sections of this note to comment
on the various strategies used (the history is somewhat simplified to
present the main points - in practice there were many more dead-ends
and mistakes).

Algorithm: Random Exploration

The original implementation modelled each cell as an independent
entity aware only of those cells with which it must not (in the final
solution) conflict.  At random it would send a message to one of these
"competitors" asserting a value.  If the two cells conflicted, one of
the cells had to change to a new (random) value.

While this is pretty much brain-dead as a practical optimisation
scheme it has one critical property that we rely on and must preserve:
its fixed point is the solution required.  In other words, if it ever
reaches a consistent solution it will not change values further
(because there are no conflicting cells).  This allows us to detect a
successful solution by monitoring the (lack of) activity in the

Algorithm: Community Invariant

The solution space explored initially is huge (9^81 = 2e77 candidate
solutions).  This can be drastically reduced by guaranteeing that each
digit from 1 to 9 is used just once in each "block" on the sudoku
grid.  This reduces the number of candidate solutions to 9!^9 = 1e50
(an improvement factor of 2e27).

Implementing this invariant took some care - see the "Lessons Learned"
for more details.

Algorithm: Reduced Range

Cells maintain a list of "legal values" which is modified when they
interact with a "given" value.  In this way Cells "learn" to avoid
values that are guaranteed to conflict.  This is "first order"
constraint propagation.  I have not measured the effect of this change
but we can make a rough estimate: if each digit can conflict with two
given values, on average, then any particular arrangement of a block
has a probability of (7/9)^9 = 0.1 of being correct (wildly assuming
independence).  That reduces the number of candidate solutions by a
further nine (number of blocks) orders of magnitude.

Algorithm: Weighting

Each cell maintains a "weight" that reflects the "confidence" in the
current value.  New values have zero weight, and weight increases when
an assertion from a neighbouring does *not* conflict.

Weighting can be used to guide behaviour when two cells interact,
comparing and exchanging values.  When comparing values the cell with
lower weight is the one that changes; swapping is only successful with
a lower weighted value in the same block.

In this way weighting increases the persistence of more likely values
and so "guides" the system to a self-consistent solution.

Algorithm: Greediness

The simple weighting described above is too strong.  It encourages the
growth of self-consistent "structures" of cells, but does not allow
these structures to be discarded when they block the correct solution.

The "greediness factor" introduces some noise into the weighting
decisions.  A greediness of 0.9 means, roughly, that weighting is
followed only 90% of the time.  In this way even strongly weighted
cells can be forced to change value.

Watching the program run (if configured to print a snapshot every
second or so) it is easy to see how varying the greediness parameter
affects the solution search.  A value of 0.9 gives a system that
oscillates between fairly long periods of detailed refinement, when
only a few "uncertain" values are changed, and shorter intermittent
sessions of more violent change, when nearly all values value.  With a
value of 0.5 the detailed searches are shorter; "wide-scale" changes
are more common.

While useful, an adjustable parameter requires tuning, which is
difficult when the test case takes several hours to run.  Also, there
is no guarantee that a single "good" value of greediness exists for
all puzzles.


Parallel Sudoku solver in Stage

From: "andrew cooke" <andrew@...>

Date: Tue, 5 Jun 2007 07:01:39 -0400 (CLT)

Jonathan Sillito's work, which I mentioned above, is now online at


Comment on this post