| Andrew Cooke | Contents | Latest | RSS | Twitter | Previous | Next


Welcome to my blog, which was once a mailing list of the same name and is still generated by mail. Please reply via the "comment" links.

Always interested in offers/projects/new ideas. Eclectic experience in fields like: numerical computing; Python web; Java enterprise; functional languages; GPGPU; SQL databases; etc. Based in Santiago, Chile; telecommute worldwide. CV; email.

Personal Projects

Lepl parser for Python.

Colorless Green.

Photography around Santiago.

SVG experiment.

Professional Portfolio

Calibration of seismometers.

Data access via web services.

Cache rewrite.

Extending OpenSSH.

C-ORM: docs, API.

Last 100 entries

TP-Link TL-WR841N DNS TCP Bug; TP-Link TL-WR841N as Wireless Bridge; Sending Email On Time; Maybe run a command; Sterile Neutrinos; Strawberry and Banana Jam; The Best Of All Possible Worlds; Kenzaburo Oe: The Changeling; Peach Jam; Taste Test; Strawberry and Raspberry Jam; flac to mp3 on OpenSuse 42.1; Also, Sebald; Kenzaburo Oe Interview; Otake (Kitani Minoru) move Black 121; Is free speech in British universities under threat?; I am actually good at computers; Was This Mansplaining?; WebFaction / LetsEncrypt / General Disappointment; Sensible Philosophy of Science; George Ellis; Misplaced Intuition and Online Communities; More Reading About Japan; Visibilty / Public Comments / Domestic Violence; Ferias de Santiago; More (Clearly Deliberate); Deleted Obit Post; And then a 50 yo male posts this...; We Have Both Kinds Of Contributors; Free Springer Books; Books on Religion; Books on Linguistics; Palestinan Electronica; Books In Anthropology; Taylor Expansions of Spacetime; Info on Juniper; Efficient Stream Processing; The Moral Character of Crypto; Hearing Aid Info; Small Success With Go!; Re: Quick message - This link is broken; Adding Reverb To The Echo Chamber; Sox Audio Tools; Would This Have Been OK?; Honesty only important economically before institutions develop; Stegangraphy via PS4; OpenCL Mess; More Book Recommendations; Good Explanation of Difference Between Majority + Minority; Musical Chairs - Who's The Privileged White Guy; I can see straight men watching this conversation and laffing; Meta Thread Defending POC Causes POC To Close Account; Indigenous People Of Chile; Curry Recipe; Interesting Link On Marginality; A Nuclear Launch Ordered, 1962; More Book Recs (Better Person); It's Nuanced, And I Tried, So Back Off; Marx; The Negative Of Positive; Jenny Holzer Rocks; Huge Article on Cultural Evolution and More; "Ignoring language theory"; Negative Finger Counting; Week 12; Communication Via Telecomm Bids; Finding Suspects Via Relatives' DNA From Non-Crime Databases; Statistics and Information Theory; Ice OK in USA; On The Other Hand; (Current Understanding Of) Chilean Taxes / Contributions; M John Harrison; Playing Games on a Cloud GPU; China Gamifies Real Life; Can't Help Thinking It's Thoughtcrime; Mefi Quotes; Spray Painting Bike Frame; Weeks 10 + 11; Change: No Longer Possible To Merge Metadata; Books on Old Age; Health Tree Maps; MRA - Men's Rights Activists; Writing Good C++14; Risk Assessment - Fukushima; The Future of Advertising and Surveillance; Travelling With Betaferon; I think I know what I dislike so much about Metafilter; Weeks 8 + 9; More; Pastamore - Bad Italian in Vitacura; History Books; Iraq + The (UK) Governing Elite; Answering Some Hard Questions; Pinochet: The Dictator's Shadow; An Outsider's Guide To Julia Packages; Nobody gives a shit; Lepton Decay Irregularity; An Easier Way; Julia's BinDeps (aka How To Install Cairo); Good Example Of Good Police Work (And Anonymity Being Hard); Best Santiago Burgers

© 2006-2015 Andrew Cooke (site) / post authors (content).

40-fold Speedup in LEPL Parsing

From: "andrew cooke" <andrew@...>

Date: Fri, 6 Mar 2009 19:51:54 -0300 (CLST)

OK, so this was a somewhat extreme test case, but by rearranging the
grammar I can make it more likely to terminate and so avoid lots of
pointless recursion.  Here's the code that does the rewriting - I'm
pleased at how simple it is, after all the work making matchers graphs,

def optimize_or(graph):
  When a left-recursive rule is used, it is much more efficient
  if it appears last in an `Or` statement, since that forces the
  alternates (which correspond to the terminating case in a
  recursive function) to be tested before the LMemo limit is

  This rewriting may change the order in which different results
  for an ambiguous grammar are returned.
  for delayed in [x for x in preorder(graph)
                  if type(x) is Delayed]:
    for loop in loops(delayed):
      for i in range(len(loop)):
        if isinstance(loop[i], Or):
          # we cannot be at the end of the list here,
          # since that is a Delayed instance
          matchers = loop[i].matchers
          target = loop[i+1]
          # move target to end of list
          index = matchers.index(target)
          del matchers[index]
  return graph

Even on a more reasonable case (the LMemo example in the docs), the
speedup is about a factor of 2 (that includes some additional tweaks

It's surprising how uniform the profiles are (the flip side of that is
that there's little to optimise from a code-tweaking pov).  The only other
improvement I can think of at the moment is adding a tokenizer (which will
allow LMemo lookahead to use larger "chunks", advancing by words rather
than letters).


Comment on this post