| Andrew Cooke | Contents | Latest | RSS | Twitter | Previous | Next


Welcome to my blog, which was once a mailing list of the same name and is still generated by mail. Please reply via the "comment" links.

Always interested in offers/projects/new ideas. Eclectic experience in fields like: numerical computing; Python web; Java enterprise; functional languages; GPGPU; SQL databases; etc. Based in Santiago, Chile; telecommute worldwide. CV; email.

Personal Projects

Lepl parser for Python.

Colorless Green.

Photography around Santiago.

SVG experiment.

Professional Portfolio

Calibration of seismometers.

Data access via web services.

Cache rewrite.

Extending OpenSSH.

C-ORM: docs, API.

Last 100 entries

Small Success With Go!; Re: Quick message - This link is broken; Adding Reverb To The Echo Chamber; Sox Audio Tools; Would This Have Been OK?; Honesty only important economically before institutions develop; Stegangraphy via PS4; OpenCL Mess; More Book Recommendations; Good Explanation of Difference Between Majority + Minority; Musical Chairs - Who's The Privileged White Guy; I can see straight men watching this conversation and laffing; When it's Actually a Source of Indignation and Disgust; Meta Thread Defending POC Causes POC To Close Account; Indigenous People Of Chile; Curry Recipe; Interesting Link On Marginality; A Nuclear Launch Ordered, 1962; More Book Recs (Better Person); It's Nuanced, And I Tried, So Back Off; Marx; The Negative Of Positive; Jenny Holzer Rocks; Huge Article on Cultural Evolution and More; "Ignoring language theory"; Negative Finger Counting; Week 12; Communication Via Telecomm Bids; Finding Suspects Via Relatives' DNA From Non-Crime Databases; Statistics and Information Theory; Ice OK in USA; On The Other Hand; (Current Understanding Of) Chilean Taxes / Contributions; M John Harrison; Playing Games on a Cloud GPU; China Gamifies Real Life; Can't Help Thinking It's Thoughtcrime; Mefi Quotes; Spray Painting Bike Frame; Weeks 10 + 11; Change: No Longer Possible To Merge Metadata; Books on Old Age; Health Tree Maps; MRA - Men's Rights Activists; Writing Good C++14; Risk Assessment - Fukushima; The Future of Advertising and Surveillance; Travelling With Betaferon; I think I know what I dislike so much about Metafilter; Weeks 8 + 9; More; Pastamore - Bad Italian in Vitacura; History Books; Iraq + The (UK) Governing Elite; Answering Some Hard Questions; Pinochet: The Dictator's Shadow; An Outsider's Guide To Julia Packages; Nobody gives a shit; Lepton Decay Irregularity; An Easier Way; Julia's BinDeps (aka How To Install Cairo); Good Example Of Good Police Work (And Anonymity Being Hard); Best Santiago Burgers; Also; Michael Emmerich (Vibrator Translator) Interview (Japanese Books); Clarice Lispector (Brazillian Writer); Books On Evolution; Looks like Ara (Modular Phone) is dead; Index - Translations From Chile; More Emotion in Chilean Wines; Week 7; Aeon Magazine (Science-ish); QM, Deutsch, Constructor Theory; Interesting Talk Transcripts; Interesting Suggestion Of Election Fraud; "Hard" Books; Articles or Papers on depolarizing the US; Textbook for "QM as complex probabilities"; SFO Get Libor Trader (14 years); Why Are There Still So Many Jobs?; Navier Stokes Incomplete; More on Benford; FBI Claimed Vandalism; Architectural Tessellation; Also: Go, Blake's 7; Delusions of Gender (book); Crypto AG DID work with NSA / GCHQ; UNUMS (Universal Number Format); MOOCs (Massive Open Online Courses); Interesting Looking Game; Euler's Theorem for Polynomials; Weeks 3-6; Reddit Comment; Differential Cryptanalysis For Dummies; Japanese Graphic Design; Books To Be Re-Read; And Today I Learned Bugs Need Clear Examples; Factoring a 67 bit prime in your head; Islamic Geometric Art; Useful Julia Backtraces from Tasks; Nothing, however, is lost with less discomfort than that which, when lost, cannot be missed

© 2006-2015 Andrew Cooke (site) / post authors (content).

40-fold Speedup in LEPL Parsing

From: "andrew cooke" <andrew@...>

Date: Fri, 6 Mar 2009 19:51:54 -0300 (CLST)

OK, so this was a somewhat extreme test case, but by rearranging the
grammar I can make it more likely to terminate and so avoid lots of
pointless recursion.  Here's the code that does the rewriting - I'm
pleased at how simple it is, after all the work making matchers graphs,

def optimize_or(graph):
  When a left-recursive rule is used, it is much more efficient
  if it appears last in an `Or` statement, since that forces the
  alternates (which correspond to the terminating case in a
  recursive function) to be tested before the LMemo limit is

  This rewriting may change the order in which different results
  for an ambiguous grammar are returned.
  for delayed in [x for x in preorder(graph)
                  if type(x) is Delayed]:
    for loop in loops(delayed):
      for i in range(len(loop)):
        if isinstance(loop[i], Or):
          # we cannot be at the end of the list here,
          # since that is a Delayed instance
          matchers = loop[i].matchers
          target = loop[i+1]
          # move target to end of list
          index = matchers.index(target)
          del matchers[index]
  return graph

Even on a more reasonable case (the LMemo example in the docs), the
speedup is about a factor of 2 (that includes some additional tweaks

It's surprising how uniform the profiles are (the flip side of that is
that there's little to optimise from a code-tweaking pov).  The only other
improvement I can think of at the moment is adding a tokenizer (which will
allow LMemo lookahead to use larger "chunks", advancing by words rather
than letters).


Comment on this post