# C[omp]ute

Welcome to my blog, which was once a mailing list of the same name and is still generated by mail. Please reply via the "comment" links.

Always interested in offers/projects/new ideas. Eclectic experience in fields like: numerical computing; Python web; Java enterprise; functional languages; GPGPU; SQL databases; etc. Based in Santiago, Chile; telecommute worldwide. CV; email.

© 2006-2015 Andrew Cooke (site) / post authors (content).

## Unifying OO, FP, Asynch Messages

From: "andrew cooke" <andrew@...>

Date: Wed, 30 May 2007 06:42:36 -0400 (CLT)

Some ideas evolving and/or expressed in different ways...

Version 1

Writing Erlang code to receive messages is very like writing functions.
Why isn't it identical?

First, there are the largely syntactical issues of function names and
currying.  These might be addressed by using "bundles" of anonymous
functions, or associating the tag in a tagged tuple with a function name.

But secondly there is also the problem of thread-local state.  When a
thread is waiting on a message it alread has some state.  This seems to be
similar to partial evaluation with explicit state.

And there is also the idea of synchronisation - messages are only

At the same time, there's the idea that one sends messages to objects.
And objects with explicit state are similar to the partialy evaluated
functions described above.

Binding a partially evaluated function to a public name would "expose" it
for evaluation.  This could be within an existing thread, in which case it
becomes normal evaluation, or it could be associated with a new thread, in
which case it becomes message passing.

The new ideas are:
- Publishing functions with names
- Evaluating functions synchonously or asynchronously
- Bundling a group of functions into a single function
- Partially evaluating functions

An object is a partially evaluated bundle - the partial evaluation is the
state and the bundle defines the methods.

Version 2

Ignore the partial evaluation/currying and relating that to objects -
that's just the usual objects as closures thing.

Instead, consider how to unify message passing and function calling.  What
does that require?  If functions receive messages then the mailbox must be
a separate entity.  But then you are sending messages to mailboxes which
isn't the same as sending them to functions.

Stuck here.

Version 3

Classes are functions that return a pair - the "next" function and a
result.  Functions are anonymous.  Here is a class:

> MkFoo = [A inc B -> Foo A, B + A
>         |A inc B -> Foo A, B - A]
> Foo = MkFoo 2
> Foo inc 3
4
> Foo dec 2
1

Here is another way to define that:

> MkFoo = [A ->
>            Foo A, [inc B -> B + A
>                   |dec B -> B - A]]

Here's another way:

> Inc = [A B -> B + A]
> Dec = [A B -> B - A]
> MkFoo = [A ->
>            Foo A, [inc -> Inc A
>                   |dec -> Dec B]]

What does "," mean?  Seems to be list construction.

"[...]" is not list construction but syntax for function definition.  Or
maybe they are the same?    But then how do we distinguish between plain
lists and parameters without looking ahead to "->"?  And does that mean
"->" is first class?  No.  This is not Lisp-like.

A different example to show persistent state

> MkBar = [A B -> MkBar B, A + B]
> Bar = MkBar 2
> Bar 3
5
> Bar 1
4
> Bar 0
1

Version 4

Bad thing about Version 3 is that there is no clear distinction between
variables bound to pure functions and variables bound to mutable objects.
Maybe the syntax should be such that they can't be bound to objects (think
list comprehensions, for example).

Version 5

Objects cannot be references from more than one thread or we get nasty
things happening.  Easiest way to prevent this is to restrict initial code
to export only functions and constants (not variables9 (as Erlang now).

Version 6

These objects are a kind of fixed point (or iterated point?).  Maybe a
syntax something like:

> {State, Value} <-> function(State, Arg)

where "State" is intentionally repeated and indicates the iterative
nature, Value is returned and Arg is the parameter.  Hmmmm.  Not perfect
:o)

Or maybe

> myFunc State Arg -> ....
>   return State2, Result
> myObj Arg -> {State, Bar} = myFunc State Arg -> Bar

But what is the initial value of State?

> myFunc State Arg -> ....
>   return State2, Result
> myObj Arg -> {Start = State, Bar} = myFunc State Arg -> Bar

Or, returning to something like list comprehensions

> myObj Arg = {Bar | (State, Bar) <- myFunc State Arg, State = Start}

Ooooh!  That works!  Means we need list comprehensions, though :o(
Something like a generator, which is close to a loop...

Version 7

C-- is sleeping.  LLVM is incomprehensible.  GCC is scary.  Parrot seems
best.

### More on OO/FP/Asynch

From: "andrew cooke" <andrew@...>

Date: Wed, 30 May 2007 07:50:36 -0400 (CLT)

The final list comprehension syntax was wrong because we need to return
not the state but the partially evaluated function (otherwise, when used
to receive messages in a state machine, you are stuck in the same state).

> myFun State Arg ->
>   ....
>   (myFun State, Result)
> myObj Arg = {Result | (Fun, Result) <- Fun Arg, Fun = myFun Start}

Result is not in a tuple - that must be a list (variable number of return
values).  In an untyped languge do we need tuples?  Not really - lists are
enough.

Also, to get something like inheritance, you need function bundles

> myFun = {myFun1, myFun2 SomeState, myFun3 _ OtherState}

Using some kind of partial evaluation syntax.

What does foldl look like in this syntax?  The above gives a kind of
generator.  It's not a function - can't be partially evaluated.  Bad
question.  What about parser/combinator?  That doesn't seem right either.
Have I missed what I was aiming at?

Andrew

### More Jabberings on Syntax

From: "andrew cooke" <andrew@...>

Date: Sat, 2 Jun 2007 11:33:02 -0400 (CLT)

I'm sure this makes sense to no-one, and only partially to me but it is
coming together,I hope.

In parallel with this I am exploring Parrot - see wiki at
http://www.acooke.org/andrew/writing/parrot/sfl.html

Anyway...

assuming we have an eval(function, args) for automated code, I think it's
a net win to have a special syntax for 0-arity functions.  This seems to
be necessary in an eager language if you want Haskell's currying and
"open" syntax.

Also, for dispatch on atoms to simulate message passing to objects, we
want to be able to define anon functions easily.

So the following seems to work:

somefun = {arg1 arg2 arg3 -> some definition
|other args     -> other definition
|yet more args  -> another}

For 0-arity functions you simply have {....}.  Evaluation is with "!".

Example:
> {print "hello world}
<thunk>
> {print "hello world}!
hello world

Note that parsing requires special lookahead, but that's just a technical
issue.

This gives a very nice way of introducing laziness, which gives you

if = {true  block1 block2 -> block1!
|false block1 block2 -> block2!
|true  block -> block!
|false block -> {}}

(Is it possible to have variable arity functions?)

> if true {print yes} {print no}
yes
> if false {print yes} {print no}
no

Which is probably similar to code blocks in smalltalk or whatever?

One thing you seem to lose is the ability to do

somefun arg1 arg2 arg3 -> some definition
somefun other args     -> other definition
somefun yet more args  -> another

Maybe that can be added as sugar?  A lot seems to depend on how you use
approach.

Andrew

### One More Step

From: "andrew cooke" <andrew@...>

Date: Sun, 10 Jun 2007 09:05:45 -0400 (CLT)

I've been trying to write this up in a paper and was fed up by the end of
yesterday, because I couldn't pull it together into a coherent whole.
However, in bed last night I think I found the key.

One of the common threads is the iteration of a function over a mutable
kernel.  This is what happens in state machines (processes running against
mailboxes), allows "objects" to be emulated, and accumulates data in list
comprehensions.

Last night I realised the obvious - that it is also present in the "right
hand side" of list comprehensions, which motivates a syntax:

> X = [1, 2, 3];
> Y <- X;
> X;
[2, 3]
> Y;
1
> mkCounter Count inc -> [Count, mkCounter \$ Count + 1];
> C = mkCounter 0;
> A <- C inc;
> B <- C inc;
> A;
0
> B;
1

One remaining problem is what to do if you *only* want the side effect.
> <- C inc;
2
seems a bit silly.

Andrew

### Update

From: "andrew cooke" <andrew@...>

Date: Thu, 14 Jun 2007 08:04:57 -0400 (CLT)

An update on this, because last night I felt I reached what could be
described as the end of an iteration.  That's the positive spin - the
negative version is that it didn't work.  But continuing with the
positive, here are lessons learned so far:

Good ideas related to function evaluation:
- combining functions in series and parallel
- functions with clauses with different numbers of arguments
- ordering of clauses
- partial evaluation of functions
- ability to "freeze" evaluation

Good ideas related to syntax:
- syntax without parentheses (a la Haskell)
(allows eager/lazy mix described above)
- atoms and variables semantically distinct
(allows pattern matching and binding)
- {...} for function clauses
- [...] for lists (with | for append)

- forcing list comprehension syntax to mean something
- chaining state with <- operator ([A|B] = B ...)
(in the end, just ugly and uncomfortable)

Emerging good ideas for next iteration:
- use mailbox to hold state
- partial evaluation with mailbox gives "objects"
- need for "nil" as false?!

Andrew

### Iteration 2

From: "andrew cooke" <andrew@...>

Date: Sun, 17 Jun 2007 18:30:57 -0400 (CLT)

I discarded list comprehension and focussed on mailboxes.

The most interesting new idea comes from a deeper understanding of how
eager evaluation, partial evaluation, variable arity, and "case"
statements in function definitions interact.  While the syntax may look
like currying, functions need to be treated "whole" if the correct case is
to be selected.

For example in:
myfun .atom1 .atom2 42 -> ...;
myfun .atom1 -> {.atom2 "foo" -> ...};
myfun .atom1 var -> ...;
all 3 cases will match myfun .atom1 .atom2, but what happens when a third
argument is applied?  In the first case it must be integer 42.  If it's
not, we expect a different case to be chose - the second case (truly
curried) would be selected but would then fail if the thirs argument was
not the string "foo".

For this iteration the second case would be selected, and then (with a
third argument of .atom3, for example) the program would fail.  To
improves this I seem to need to do at least one of:
- change the syntax to delimit function arguments
- remove variable arity
- disallow partial evaluation

Obviously the last of these is much the most interesting.  So I need to
read up on Snobol...  From wikipedia:  "providing operators for pattern
concatenation and alternation" (which is exactly what is happening here
with function arguments).

Andrew

### Lessons from Icon

From: "andrew cooke" <andrew@...>

Date: Mon, 18 Jun 2007 06:33:42 -0400 (CLT)

It was Icon, more than Snobol, that I needed to investigate (Icon being a
more modern development and a more general language - Snobol is perhaps
more like Awk, with string matching and a reliance on "goto").

At first I couldn't see how Icon's treatment of failure, looping, and
coroutines were related, but I think the following implementation
mechanism clarifies things (no idea if it is correct, but it was useful).

Consider a language with support for both exceptions and continuations.
My impression is that Icon-like functionality can be implemented in such a
language by doing the following:

- Adding an implicit parameter to function calls, which is a stack of
continuations.

- On failure, throwing an exception which contains the continuation stack.

- On requiring a repeat, pushing the current continuation to the stack,
and throwing an exception containing the stack.

- Some operations (fork points), on entry, push continuations for
alternatives onto the stack.

- Various operations (fork points and, fir example, explicit "every")
catch the exceptions described above, pop the topmost continuation from
the stack, and call it (passing the rest of the stack as normal).

- In addition to this "automatic" handling of continuations, also allow
explicit creation and calls to coroutines.

The above, taken together, provides for both backtracking and
coroutine-like looping.

For SFL things can be simplified considerably, since only backtracking is
reduce memory usage we need either (or both):

- An explicit "cut" that discards the coroutine stack

- Automatic detection and elimination of continuations within repeated
sense until you consider very abstract, high level, general routines like
fold).

So, to be more concrete:

- Joining a pair of functions into a bundle creates a fork point as
described above.

- Failure to pattern match throws an exception with continuation stack.

- Explicit notation for failure and cut also needed.

Ideas that might be worth exploring:

- Labelling fork points and allowing a cut to that label (or restarting
from that label).

- Supporting coroutines anyway.

Andrew

### Backtracking

From: "andrew cooke" <andrew@...>

Date: Tue, 19 Jun 2007 09:22:54 -0400 (CLT)

Backtracking isn't trivial to implement :o)

A bundle of single parameter functions forms a choice point.  The
functions are ordered (from whether they appear to the left or right of |
and the first function is evaluated.  If that evaluation fails (typically
because of pattern matching failure between parameter and argument) then
the next function is evaluated instead.  If evaluation succeeds then
evaluation of the bundle as a whole is a success, a value is returned, and

Failure may occur deeper'' within the evaluation of a function ---
through the failure of nested and composed functions, or variable binding
with patterns, or even via explicit failure (the <<< operator).  In all
these cases control returns to the choice point and the next function in
the bundle is evaluated.

The same process extends directly to multiple parameter functions (through
currying) and to the SFL program as a whole.

The backtracking implementation should impose minimal overhead.  In
particular, care needs to be taken with (not) breaking tail call
optimisation.

There are two issues here.  First, how to avoid consuming memory on tight
tail call loops.  Second, how to clear choice points that are no longer
relevant (because the program has returned through them'').  The second
is only an issue if the solution to the first moves choice point handling
from the stack.

Possible (partial) solutions:

- Separate stack of continuations.
- Specify a maximum depth.  Ugly and broken by tight inner loops.
- Name stack entries (via user annotation) and delete any stack between
repeated names.  Works well for high level loops, but not for utility
functions.
- Allow forcing --- none, or perhaps one level only of backtracking. Might
be useful for inner loops.  A use case with higher level functions would
be instructive; want to be as flexible as possible.
- Do use the stack, but ameliorate with some of the above.
- Allow backtracking to be disabled in certain scopes.

Examples:

- One use case for backtracking is accumulate data on success and return
it on failure.  In other words, as loop control.  Here it is the driver
function (the fold or whatever) that needs to be efficient --- the inner,
evaluated function, may use backtracking, but is called separately on each
iteration.
- Main loop in a server, handling incoming messages.  Needs backtracking
but also needs to tail call for next message.

Andrew