(Mostly) Imperative Packrat Parsing

I’ve been tinkering with monadic parser combinators combined with continuations (see Monads, Mother of All), but my latest toy simplifies all that by using PEG grammars. PEGs and CFGs are nearly interchangable for most parsing of computer languages, and can be parsed in linear time and space by packrat parsing — parsing with backtracking, but uses memoization to bound backtracking cost to a constant factor as parsing work is not repeated.

The end result will be succinctly:

def number():
  x = digit()
  while test(digit):
    x = x*10 + digit()
  return x

print parse(number, "15")

Continue reading “(Mostly) Imperative Packrat Parsing”

Non-determinism from generators

UPDATE(7/3/2013) I’ve found an earlier use of this technique written about on a blog, Valued Lessons – Monads in Python from 2008. Please read it too!


I was tinkering a bit tonight with simulating nondeterminism a-la McCarthy’s Amb operator — in Python. This is peculiar as Amb is usually implemented using either first class continuations or syntactic abstractions. Python supports neither.

So what is non-deterministic programming? Continue reading “Non-determinism from generators”


A little Python snippet, good for a diversion. I couldn’t fit it into a tweet though:

s=[];op=dict((o,eval('lambda a,b:b%sa'%o)) for o in '+-*/%')
while not any(s.append(op[w](s.pop(),s.pop())if w in op 
else float(w)) for w in raw_input().split()):
 print '=',s[-1]

Just a simple 4-function RPN calculator, accepting input from console:

 = 1.0
 2 2 +
 = 4.0
 1 2 -
 = -1.0
 2 3 * 1 +
 = 7.0
 2 3 1 * +
 = 5.0

Bondage and Discipline Python

Python has an identity crisis sometimes. It starts with the premise, from Guido’s prior work on ABC, to make a simple but easy to understand language.

But then turns around and cries out “one way to do it“, leaving the programmer perplexed as to how Guido van Rossum thought we should do things. For example, Guido hates tail calls, so recursion isn’t the one way to do it that he picked (note that his blog post and followups contain a large number of factual errors; read it as an opinion piece only).

During these fits, Python suffers itself to be a bondage and discipline language.

Apparently there is hope. One bit that causes me particular pain is that nested functions cannot rebind variables in the outer scope; only read from them. A case in point:

def f():
  x = 1
  def doubleIt():
    x *= 2 # local variable 'x' referenced before assignment
  doubleIt(); doubleIt();
  return x;

However, I was reading a piece on Factor, and it mentions that this restriction is lifted in Python 3.0. I’m still on 2.6 (the only differences I had been aware of were the somewhat arbitrary swapping of the ‘/’ and ” operators, and the insulting removal of map, filter, and reduce from Python 3.0). However, fixing this scoping rule, even if an extra keyword is needed, sure would be convenient for me.

Actually, there are a lot of improvements:

  • Various APIs returns views instead of mutable list (copies).
  • sorted is now built in. (I’ve had to write that myself so many times…)
  • Apparently map and filter are still here (though he still went and put reduce all the way off in functools. I guess you can’t have it all -_-)
  • Set literals, yay!

Now, if Guido would be so kind to add proper statement support to lambdas, I’ll switch ^_^. Oh, but Guido hates lambdas. Oh well, next time maybe.

— signed an ambivalent Python user

Eval in Python

I’ll just leave this here for you:

Wait what. Python compiles? That is correct. CPython and PyPy (the implementations worth caring about currently) are in fact creating a code object from the string you pass to exec or eval before executing it. And that’s just one of the things many people don’t know about the exec statement.

It doesn’t always have to be about theory, nor should it be. I might recommend this article for programmers already versed in Python, to read side by side or leading up to SICP’s “Metalinguistic Abstractions” chapter.

Quickrefs for Python and Vim

More quick reference booklets:

I don’t really need the qr for Python, but I thought it might be good to have ready should a friend a coworker wish to start learning Python.

As for Vim, the reference is certainly terrible – I’ve barely used it at all. The qr is just a list of commands that looked useful. I’ll edit the qr as I go, perhaps using the last pages for notes.

Memoizer revisited – Python

A conversation with a coworker entertained the idea of giving a talk on dynamic typed programming language patterns to a group of C++ developers — not that we’re experts, but as part of my team’s initiative for continued education for our engineers, individuals go out and research different ideas, techniques, technologies and present the results to the group so we can, in theory, learn more than each of us could learn in isolation.

Of course, if I want to have any chance of saying anything intelligent, I would need to have more working experience than ‘having tinkered with Perl 5 at one point’. 

So, I’m learning Python.

Starting at the beginning

The first thing I did is go out and download IronPython and CPython, and install both.  Start up the runtime and:


Ooo, interactive console.  I remember this from Matlab and Maple in college.  Okay, okay, my experience is out of date, but I’m learning.  I wasted like a week on one C++ project last fighting with the limitations of a hand-rolled interactive shell.  Having one out of the box is already a major win, and I’m forgetting how I lived without one for so long. 

After half an hour with O’Reilly’s Python pocket reference, I had our friendly naive Fibonacci sequence generator:

>>> def fib(x):
...   if x < 2:
...     return 1
...   else:
...     return fib(x-1) + fib(x-2)

And we can invoke it with a range of integers 0 thru 9 with the built in map() function and range() functions

>>> map( fib, range(10) )
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

To the code

So, moving to a source file "memoize.py", memoize becomes some special magic.  This is short and sweet, and I like it even better than when I tinkered with JavaScript closures:


def memoize(f):
  "single argument memoization for results of 'f'"
  store = {}
  def anon(x):
    if store.has_key(x):
      return store[x]
      y = f(x)
      store[x] = y
      return y
  return anon

This is hardly complete, but works to illustrate.  How this works: when memoize is invoked with a function, ‘f’, it sets up a new hash table named ‘store’.  It then on the fly defines a new function ‘anon’ that uses both the original function ‘f’ and the new hash table.  This not-anonymous function is then returned from memoize.

You can now replace fib with the statement

fib = memoize(fib)

This new function object, returned from the call, is assigned to the ‘fib’ variable.  This function object will in turn sometimes invokes the original function, which calls ‘fib’, which now points to the memoizer function object, and not the original function.  Function names are just variables, and can be assigned.

As far as I know, there is no syntax for a function to call itself — it can only call through a variable, and that variable can be reassigned if you choose to.


I would be done, but before finishing up this article, I luckily stumbled across decorator syntax — which is exactly this, a shorthand for saying ‘f = g(f)’.  So after defining memoize, we can skip a set and define ‘fib’ pre memoized:

def fib(x):
  "computes fibonacci numbers"
  if x < 2:
    return 1
    return fib(x-1) + fib(x-2)

Paste both of these into "memoize.py", and at the interactive console:

>>> from memoize import fib
>>> fib(100)

And that’s it.  In all, less than half the size of the C++ example. 


And don’t forget to try out the interative console at http://play1.pypy.org/console/.  The code samples above work, and on my machine (Asus eee 901, Win 7, Opera 9.6), it takes less  than a second to compute fib(800).

EDIT ADD: After this posting, I came across documentation for a decorator module in python.  This module apparently provides a set of helper functions for defining decorators.  Among the examples listed is a function named ‘memoize’ that is almost line-for-line identical with my definition above, except that it supports multiple arguments, and uses diifferent dictionary lookup.

In practice, you will want to scope the lifetime of the dictionary — one way to do this is to only use the memoize decorator on nested functions which have definite or bounded scope.