From: "andrew cooke" <andrew@...>
Date: Sun, 11 Nov 2007 21:50:47 -0300 (CLST)
For some time I have been thinking about a data reduction system that supports both distributed computing and lazy evaluation. The basic idea is that you define a final image in terms of a graph of intermediate images (nodes) and processes (edges). Requesting an image triggers the generation of its input images, and so on back through the graph. One advantage of this approach is that "what-if" exploration is simplified. Another advantage is that it may lead to pipelines "for free" (by generalising the graph).

I just stumbled across Pyke - http://pyke.sourceforge.net/ - an inference system written in Python which can associate functions with backward-chaining rules to generate code.

A quick sidetrack on logic programming: expert systems and the like can work in two ways, either starting with basic facts and finding rules that lead to the expected result (forward chaining), or starting with the result and finding rules that lead back to the initial conditions (backward chaining).

This seems to connect with the data reduction idea. You can imagine a system which has a set of rules about how to reduce data, and which is given a set of input data and a target: produce a final image. The system would infer the correct process and could then construct the graph and generate Python code (the pipeline).

It's not a silver bullet. Nothing here addresses caching, for example, but it seems like a step in the right direction.

Andrew
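
To make the backward-chaining idea above concrete, here is a minimal sketch in plain Python. It does not use Pyke's API. The names `Rule`, `plan` and `run`, the rule names ("debiased", "flat_norm", "final") and the numbers are all hypothetical, and single floats stand in for images. Starting from the requested target, `plan` works backwards through the rules until it reaches data that is already available, and `run` then executes the resulting steps in order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """One reduction step (hypothetical): builds `target` from `inputs`."""
    target: str
    inputs: Tuple[str, ...]
    func: Callable[..., float]


def plan(target: str, rules: List[Rule], available: Set[str],
         _seen: Tuple[str, ...] = ()) -> List[Rule]:
    """Backward chain from `target` to `available`; return rules in run order."""
    if target in available:
        return []  # already have this image, nothing to do
    if target in _seen:
        raise LookupError(f"circular dependency on {target!r}")
    for rule in rules:
        if rule.target != target:
            continue
        try:
            steps: List[Rule] = []
            for name in rule.inputs:
                for step in plan(name, rules, available, _seen + (target,)):
                    if step not in steps:  # shared inputs are planned once
                        steps.append(step)
            return steps + [rule]
        except LookupError:
            continue  # this rule cannot be satisfied, try another
    raise LookupError(f"no rule produces {target!r}")


def run(target: str, rules: List[Rule], raw: Dict[str, float]) -> float:
    """Execute the inferred pipeline; only the needed steps are computed."""
    images = dict(raw)
    for rule in plan(target, rules, set(raw)):
        images[rule.target] = rule.func(*(images[n] for n in rule.inputs))
    return images[target]


# Toy rules, listed in no particular order; floats stand in for images.
RULES = [
    Rule("final", ("debiased", "flat_norm"), lambda d, f: d / f),
    Rule("debiased", ("raw", "bias"), lambda raw, bias: raw - bias),
    Rule("flat_norm", ("flat", "bias"), lambda flat, bias: flat - bias),
]
RAW = {"raw": 110.0, "bias": 10.0, "flat": 60.0}

if __name__ == "__main__":
    print([r.target for r in plan("final", RULES, set(RAW))])
    print(run("final", RULES, RAW))
```

The list returned by `plan` is the inferred pipeline. In this sketch it is executed directly rather than emitted as code, and nothing is cached between calls to `run`.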