Is SICP just a book?

SICP stands for “Structure and Interpretation of Computer Programs”; it is an introductory computer science book, written by Hal Abelson and Gerald Sussman for their introductory computer science course given at MIT from 1981 until 2007.  A professional recording of the lectures was made in 1986 and is available for download online, as is the book itself.

Why so much controversy?

I had missed these comments, and the resultant reddit gossiping – a continuation of the previous argument between two programming-language subgroups.  Guido tweeted as follows:

As I feared, the copy of SICP is from someone who believes I am not properly educated. No thanks for the backhanded insult.

I have to admit, sending someone an introductory computer science text can carry a lot of connotations, whether intended or otherwise.  Guido gave his two cents on the book, tacking it on to a review of some intro programming books.  A little uproar followed in kind.

My impression

I experienced SICP only this spring – I had heard of its controversial nature second hand, having gone to a Java school.  I noticed a coworker had a copy, so I borrowed it and gave it a weekend skim. 

It really is an introductory text, but it is old-school in expecting the reader to be really smart and to work at the material.  I wouldn’t call it unforgiving though – it doesn’t do any hand-holding, but at the same time, it doesn’t give you any problems that you shouldn’t be able to solve.  The examples are practical instead of contrived, but still idealized and simplified.  In case you can’t tell already, let me translate: I was hooked.

Unlike van Rossum, I very much enjoyed the book.  I also like Coke.  Maybe he likes Pepsi.  What’s the big deal?

It’s not just a book… it’s an idea

If you watch the video lectures, you might see the same thing I did: I get the sense that SICP isn’t just a book.  Or just instruction.  It is those things, but it’s also a particular manner of thinking, or rather a particular improvement to our manners of expressing our thoughts about software.  In the original first edition preface, the authors write (emphasis mine):

First, we want to establish the idea that a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology.  Thus, programs must be written for people to read, and only incidentally for machines to execute.  Second, we believe that the essential material to be addressed by a subject at this level is […] the techniques used to control the intellectual complexity of large software systems.

The idea, premise, or belief that they demonstrated to me in their lectures is that through abstraction you can manage software complexity, and build anything you can imagine with software.  More importantly, you can design that software in such a way that the code can, piece by piece, be understood and commanded by anyone suitably knowledgeable in the art.  The lectures eventually climax with a simplified, but complete, model of the language itself, called the metacircular interpreter.  Using little or no sleight of hand, each piece can be understood and composed independently.  By comprehending the incantations you can create new ones, without bound, like magic.

I’m not inventing the magic metaphor – Sussman had a cape and wizard’s staff at the end of that lecture.

It’s a philosophy…

Seriously though, it prescribes a model of reality and a method of reasoning about it.  And like many philosophies, you can make a way of life or practice out of it.  Some people will be really enthusiastic about it, having found much benefit, happiness, or both in it.  Those proponents can expound on its virtues, and the general applicability of those virtues may make it sound as if it’s a cure-all.

And likewise, there will be opponents that have a profound distaste for it.  They may object to its reasoning or axioms, or its results, or both.  In some way it appears to conflict with their own philosophy; whether the conflict is real or only superficial is inconsequential in making them opponents.

I get the impression that many opponents of SICP are also proponents of software abstraction, but come at it a different way or understand the thing differently.  We’re not really at odds with each other most of the time, but more so are experiencing a communication failure.

SICP certainly does have an agenda.  But we are seeing it differently.

Epilogue

I can’t claim any philosophic neutrality whatsoever on SICP.

I finished watching the lectures this past week, having read the book cover to cover the week prior after ordering it on Amazon.  What can I say, I’ve joined the order of the lambda.

I think Peter Norvig put it best in his review of SICP on Amazon:

I think its fascinating that there is such a split between those who love and hate this book. […]

Those who hate SICP think it doesn’t deliver enough tips and tricks for the amount of time it takes to read. But if you’re like me, you’re not looking for one more trick, rather you’re looking for a way of synthesizing what you already know, and building a rich framework onto which you can add new learning over a career. That’s what SICP has done for me.

Cheers.

The Python debate on Tail Calls

The Beginning

It started with an innocent enough blog post by Guido van Rossum commenting on the absence of tail call elimination throughout Python’s history.  Okay, not even commenting – it was a parenthetical sentence fragment!

The internet went crazy with comments.  Guido attempted to clarify.

However, some technical inaccuracies – or rather, technical issues that the Python language and runtime currently don’t address – led to only more debate.  The largest oversights were in his discussion of how tail calls might be implemented in Python:

The first observation is that you can’t do it at compile time. I’ve seen at least one blog entry that used a bytecode hack to replace a CALL opcode immediately before a RETURN opcode with a jump to the top of the function body

He went on to imply that tail recursion was somehow thwarted by dynamic function lookup – that it could only be done when the compiler could statically transform all tail call sites into gotos.

He corrected this in the followup, but then went on to say,

Now, apart from the observation that since TCO essentially is a GOTO, you write spaghetti code using TCO just as easily, Ian Bicking gave a solution that is as simple as it is elegant. …

He then presented a classic, simple technique for implementing tail call elimination – the trampoline.  Of course, you have to do this manually in Python unless you modify the bytecode compiler to do it for you.
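To make the idea concrete, here’s a minimal sketch of a trampoline in Python (my own illustration – Ian’s actual code differs): in tail position, the function returns a zero-argument thunk instead of calling itself, and a driver loop bounces on the thunks.

```python
def trampoline(f, *args):
    """Keep invoking thunks until a non-callable result appears."""
    result = f(*args)
    while callable(result):
        result = result()
    return result

def factorial(n, acc=1):
    # Instead of making the tail call directly, return a thunk for it;
    # the trampoline performs the "call" without growing the stack.
    if n <= 1:
        return acc
    return lambda: factorial(n - 1, acc * n)
```

With this, trampoline(factorial, 10000) runs in constant stack space, where the directly recursive version would blow past Python’s recursion limit.  The caveat of this naive sketch: if the computation’s real result were itself callable, the callable() test would mistake it for a thunk.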

((Of course, I can’t help pointing out that non-tail-call procedure calls are also like GOTOs.  Guy Steele discussed this analogy/parallel in “Debunking the ‘Expensive Procedure Call’ Myth, or Procedure Call Implementations Considered Harmful, or Lambda: the Ultimate GOTO”, available in The Original ‘Lambda Papers’.  I’d summarize if I thought there was anything I could add, but there really isn’t.))

Rebuttals

Joe Marshall has provided the most comprehensive replies to this discussion that I’ve seen, in his five-part series.

Part 4 is the most focused piece I think, where he addresses the issues I mentioned above:

[example of a dynamic tail call] This will work in any version of Scheme. The compiler doesn’t need to analyze the target of the function call, it only needs to notice that the caller’s stack frame is no longer in use.

And the topic of debugging:

MIT-Scheme has an approach that seems promising. […] the history is a ring buffer with a limited size. When a tail recursive call is recorded in the history, each frame in the history moves down by one. If a sufficient number of calls are made, the oldest record is deleted

He also mentions in part 2 that debugging tail calls is just something that Schemers learn. 

But I have this to add: yes, it’s terribly convenient to have a stack history when debugging in Python – most of the time you don’t need every stack frame, but once in a while it’s useful.  But the same thing happens with loops.  A failure is detected, but this loop iteration didn’t cause the failure; the cause was 2 iterations back and we don’t have that state any longer.  Darn.

A key observation in Scheme is that looping and tail recursion are two abstractions for doing the same thing – iteration over a regular data structure or sequence.  Because the problem is the same, the useful debugging features are the same – in both cases, a history is a useful debugging aid, but also bleeds runtime performance. 
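To illustrate that equivalence with my own small example (in Python for familiarity – a real Scheme runs the second form without growing the stack):

```python
# The same iteration, written as a loop and as tail recursion.
def sum_loop(xs):
    total = 0
    for x in xs:
        total += x       # loop state lives in local variables
    return total

def sum_tail(xs, total=0):
    if not xs:
        return total
    # the tail call carries the loop state in its arguments instead
    return sum_tail(xs[1:], total + xs[0])
```

In both versions, the intermediate states are already gone by the time a later failure is detected – so the debugging story, and the cost of keeping a history, is the same.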

Confusion about the abstraction

I also want to defend Guido though (not that he needs defending – he can stick up for himself – but for the sake of argument).  He doesn’t need tail recursion to do iteration, or visitor dispatch, etc.  C programmers, those still around, will insist they don’t need C++ classes and templates to do objects, and rightly so.  And a Perl programmer will likely tell you they don’t need anything else to do anything.  (Apologies to the Perl programmers, no offense intended.)

And in all these cases it’s true.  Guido gets things done without tail recursion.  And that’s what Python is about – getting things done.  And Schemers get things done too, in the minimalist way that Schemers like and are accustomed to.

I’m not going to pretend the various programming languages are equal, but I don’t think there’s a total order either.  Even if there is, even if X is a wholly better way-to-do-things than Y, or even mostly better, this whole debate has failed in one very essential way – this discourse was too much a debate about betters and worsers, and not enough a discourse about understanding the other.

I think commenter Martin demonstrates this best: “I fail to see why this is such an important feature that at least a significant amount of people have an opinion about it.”  If, after so many words, this is the most common result, then I think both sides have failed.

Kudos to Guido for correcting his initial errors; I think he’s done well to describe the Python Way and how Python has chosen its alternatives to tail call elimination.

Kudos to Ian for providing trampoline Python code that functions as a working, simplified model of tail call elimination – I think that may go a long way toward explaining TCE to Python developers.

Kudos to Joe for demonstrating cool debugging features in MIT Scheme – before that example, I hadn’t considered that TCE debugging tools could help in debugging my loop iterations as well.

And kudos to everyone else out there that also accomplished more listening than talking.

CLOS circa 1987, Dynamic Dispatch in C++/C#

As reported by programming musings,

http://www.archive.org/details/DanielGB1987

Common Lisp Object System presentation, by Daniel G. Bobrow.

Some reflections:

Fast Multiple Dispatch

"[for dynamic method resolution] A cache of 1000 entries is hit 98% of the time".  Assuming this is accurate, it explains how you could get by with the cached virtual dispatch lookup that the desktop Microsoft CLR runtime does – while not exactly what Bobrow means, thousands of cache locations spread through the program can achieve a hit rate high enough to compensate for the slow resolution path.

Fastpath Optimization

"What we learned from Smalltalk is … something like 60%-70% of generic functions have exactly one method implemented".  I’ve been seeing discussions lately about fastpath optimization in dynamic languages.  The technique takes advantage of the fact that not all dynamic call sites are very dynamic.  You can sometimes statically dispatch calls (assume some parameters are fixed-width integers, others strings, etc.) as long as you have recovery code that can detect the difference before any results are visible.
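The shape of the trick can be sketched in Python (purely illustrative – real runtimes do this in generated machine code, not in user code): speculate on the common case, guard it with a cheap type check, and keep a recovery path for everything else.

```python
def make_fastpath_add(generic_add):
    """Speculatively specialize addition for ints, with a guarded
    fallback to the fully dynamic implementation."""
    def add(a, b):
        if type(a) is int and type(b) is int:
            return a + b            # fast path: dispatch is known statically
        return generic_add(a, b)    # recovery path: full dynamic lookup
    return add

def generic_add(a, b):
    # stands in for the slow, fully dynamic dispatch
    return a.__add__(b)

add = make_fastpath_add(generic_add)
```

The guard runs before any result is visible, so falling back is always safe – exactly the property the optimization requires.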

Tech talks haven’t really changed that much in 20 years; apparently we’ve long since found the local optimum.  Code samples that fit on slides are always stacked; they don’t solve real problems.  Diagrams are always awkward, but we keep trying anyway.  Oh, and we didn’t compile the code before putting it in!  But don’t get me wrong, I think this talk was well done, and to be fair, he knew of the error (I assume back then, editing slides was a bit harder than PowerPoint).

Some CLR tie-ins

I think both of these features play a significant role in the CLR’s performance characteristics.

For those not familiar, if you step through a virtual call in MSIL on x86 or x64, you’ll see that each call site caches the address of the function that was last hit from that call site.  Consider the pseudo code:

void some_object::do_something(int x)
{
    //...
}
int main()
{
   some_interface* p = get_an_object();
   p->do_something(42);
}

In C++, typically some_object has a vtable for each interface, and each member function has an entry in this table.  Actual execution looks something like:

// ----  C++ style dispatch ----
struct some_interface
{
    struct vtable_layout
    {
        void (*do_something)(some_interface*, int);
    };
    vtable_layout* vtbl;
};
some_object::some_object()
{
    static vtable_layout vtbl = {
        &some_object::do_something
    };
    this->vtbl = &vtbl;
}
void some_object::do_something(int x)
{
    //...
}
int main()
{
   some_interface* p = get_an_object();
   p->vtbl->do_something(p, 42);
}

Note that because the dispatch mechanism is tied to the vtbl, only the first argument (or, rather, only one argument, according to some convention) can affect dispatch.

The CLR dispatches calls differently, more like:

 

// ---- CLR style dispatch ----
struct object
{
    void* type_identifier;
};
void default_do_something(object* p, int x)
{
    void (**memfun_do_something)(object*, int);

    // find address of "last_function_called" variable
    memfun_do_something = find_callsite();

    // fill in value using dynamic type of 'p'
    *memfun_do_something = 
        global_function_table
        .find_type(p->type_identifier)
        .find_member_function("do_something");

    // finally, call the actual member function
    return (*memfun_do_something)(p, x);
}
some_object::some_object()
{
    this->type_identifier = some_object_id;
}
void some_object::do_something(int x)
{
    // thunking code -- in case wrong function called
    if (this->type_identifier != some_object_id)
        __tailcall(default_do_something(this, x));
    
    // ...
}
int main()
{
   some_interface* p = get_an_object();

   static void (*last_function_called)(object*, int) 
    = default_do_something;
   last_function_called(p, 42);
}

This is much more complicated – though keep in mind that the function ‘default_do_something’ can be shared for all defaults, using the call site as a hint.  Here, if the default is called or if the ‘last_function_called’ cache is wrong, we pay the hit of searching the type system for the method to call.  However, if the call is on the same type of object as the last call, then the cache lets us make the call with a single indirection.  And because it is not using dispatch tables but rather a single type id, we can in general get away with having only a single type id per object instead of one vtbl pointer per interface implemented, saving space as well.

There are still two indirections going on, but only one indirection that will force a stall in the instruction pipeline – the type check can take advantage of out-of-order execution and branch prediction.

Related to CLOS, this scheme’s performance is no longer linked to single dispatch – any lookup scheme can be implemented in default_do_something, provided that a good cache hit rate can be achieved.

As well, when only one implementation exists for a virtual function, the cache hit rate will be 100% after the initial call.  For the fast path, this means that you could potentially recompile main() to always call some_object::do_something, and rely on the type check to dispatch correctly if the situation changes.

Understanding Continuations

Experimenting with Lisp variants eventually led me to the programming construct of continuations, popularized by the Scheme programming language.  In case you’re not familiar with the construct, it’s a control construct (like the if statement or exception dispatch) that works a little like this:

Say you’re in the kitchen in front of the refrigerator, thinking about a sandwich.  You take a continuation right there and stick it in your pocket.  Then you get some turkey and bread out of the refrigerator and make yourself a sandwich, which is now sitting on the counter.  You invoke the continuation in your pocket, and you find yourself standing in front of the refrigerator again, thinking about a sandwich.  But fortunately, there’s a sandwich on the counter, and all the materials used to make it are gone.  So you eat it. 🙂

A discussion on LtU goes on a bit longer, but that’s not the whole of it.  Exceptions can be modeled as escaping continuations; co-routines and threads as mutually invoking continuations.  Continuations are a powerful tool that can be difficult to understand.  This was my effort to understand them.

 

My exercise for this is implementing generators — called iterators/enumerators in C#, input iterators in C++ — using continuations in Scheme.  The contract for my generators will be C++ iterator-like.  An iterator value is a function; invoking it with ‘get is like dereferencing, and will return the current element; invoking it with ‘next will return the next iterator value, or #f false at the end.

Some code to illustrate my goal:

(it 'get)              ; get current element
(set! it (it 'next))   ; move forward 
(when (not it) (done)) ; detecting end of iteration

And some pseudo code, C# iterator style:

(define (gen3)
   (let loop ((x 1))
      (yield x)
      (unless (= x 3) (loop (+ x 1)))))

The advantage of this interface design, the C++ iterator concept, is that code can be made to function with multi-pass iterators, iterators that can be saved for resuming iteration or refer to a particular storage location in a container.
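As an aside, the contract itself doesn’t require continuations at all – here is the same interface sketched with plain Python closures (my own illustration; the point of this post is doing it with call/cc, where there is no underlying sequence to index into):

```python
def make_iter(seq, i=0):
    """An iterator value per the contract above: a function that answers
    'get' with the current element, and 'next' with the next iterator
    value or False at the end."""
    def it(msg):
        if msg == 'get':
            return seq[i]
        if msg == 'next':
            return make_iter(seq, i + 1) if i + 1 < len(seq) else False
        raise ValueError(msg)
    return it

# walk the sequence 1 2 3, mirroring the Scheme driver loop
it = make_iter([1, 2, 3])
out = []
while it:
    out.append(it('get'))   # dereference
    it = it('next')         # advance; False ends the loop
```

Because make_iter is side-effect free, these iterator values are multi-pass: you can hold on to one and resume from it later.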

First try

(define (writeln x)
  (display x)
  (newline))

(define call/cc call-with-current-continuation)

; generate 1 2 3 as a sequence using special iterators
;  broken: only works from the REPL because 'exit' always returns to the original caller
(define (gen3a)
   (call/cc (lambda (exit)
      (let loop ((x 1))
         (call/cc (lambda (resume)
            (exit (lambda (msg)
               (cond
                  ((eqv? msg 'get) x)
                  ((eqv? msg 'next)
                   (resume '()))
                  (#t (raise-error)))))))
         (unless (= x 3) (loop (+ x 1))))
      (exit #f))))

; and try out this generator
(writeln "gen3a")
(let continue ((it (gen3a)))
   (when it
      (writeln (it 'get))
      (continue (it 'next))))

Paste this into the Scheme REPL and it runs, printing 1 2 3.  But does it work?  I first tried this code out thinking I was halfway there, to see how far it would get.  I was rather surprised when it printed the right answer.

So what’s really happening here?  The first call/cc sets up a fast path for returning from the function — this is an escaping continuation.  Calling ‘exit’ will now return immediately from the gen3a call.

The let begins the loop, just as it did in the earlier pseudo-code.  The next call/cc with resume forms the pseudo-code yield, exiting with a lambda function.  The magic in this lambda is the resume parameter.  It lets us pick back up again right in the middle of the loop! 

It seems like everything is ready to go. The returned lambda becomes the iterator object, and when you invoke with ‘next, it picks back up in the middle of the loop.  So what’s the problem?

The REPL Bug

The issue is really subtle, so don’t blink.  Exit is never modified, right?  So what happens when we exit the second time through the loop?  Well, we return to the original caller’s context, not the current one.  Luckily for the REPL case, that earlier context is the same as the current one, but this might not always be the case.  To see this, you first need something that can see the difference:

(let continue ((it (gen3a)) (ix 1))
   (when it 
      (begin
         (writeln (list ix (it 'get)))
         (continue (it 'next) (+ ix 1)))))

In this loop, each call to (it ‘get) should be in a different context for the ix integer.  As a result, this prints (1 1) (1 2) (1 3) — the same value for ix each iteration.  The second call returns to the wrong place!  Even “weirder” things can happen too, all due to using that exit continuation more than once.

The Fix

One potential fix is to replace the exit routine, each iteration, with a new continuation to return to the current caller instead of the initial one.  When we do this, this iterator and its use snap into place, working as intended:

(define (gen3b)
   (call/cc (lambda (exit)
      (let loop ((x 1))
         (set! exit
            (call/cc (lambda (resume)
               (exit (lambda (msg)
                  (cond ((eqv? msg 'get) x)
                        ((eqv? msg 'next)
                         (call/cc (lambda (new-exit) (resume new-exit))))
                        (#t (raise-error))))))))
         (unless (= x 3) (loop (+ x 1))))
      (exit #f))))
(let continue ((it (gen3b)) (ix 1))
   (when it 
      (begin
         (writeln (list ix (it 'get)))
         (continue (it 'next) (+ ix 1)))))

Unfortunately, we’ve introduced side-effects, cementing the iterator as an input iterator, non-restartable.  On a positive note though, the previous loop now works. 

This is a bit ugly at this point – three calls to the mystifying call-with-current-continuation – but it was interesting to me.  This code isn’t complete; I feel like a bull in a china shop, or maybe a bull writing COBOL.  Opinions and one-upmanship welcome.  Some ideas:

  1. Use fewer call/cc operations to do the same thing.  Three definitely feels like overkill.
  2. Finish the goal with some nice define-syntax to make this iterator feel like part of the language, or at least language-extension-esque (à la C# 2.0 iterator syntax).
  3. This iterator was written side-effect free up until the (set! exit …), which breaks restarting.  Rewrite this generator to be restartable – in C++ iterator terminology, fulfilling forward iterator requirements.

Note: examples here run under PLT Scheme.  YMMV.

You Can be a Lisper Too!

"Really?" I hear you saying, "you can’t be serious!  Python and Ruby, that’s one thing, but Lisp?  Seriously?!"

About Lisp

Briefly, Lisp is a family of programming languages that date back to 1958.  It is characterized by a very simple syntax, garbage collection, reference semantics, and very powerful language extension and self introspection capabilities.  Many features in common programming languages today were first pioneered in dialects of Lisp.

Common Lisp is the most commonly available and "general purpose" Lisp, and has been out there since 1984.  It supports object-oriented, imperative, functional, and generic programming techniques.

For information on the history of Lisp, please check out the paper The History of Lisp by Lisp’s original designer, John McCarthy — there are a lot of references in that paper to get you started with Lisp’s history.

For information on the present of Lisp, such as learning Common Lisp, I will again recommend Practical Common Lisp by Peter Seibel, available in print form or online at no cost.

I thought Lisp was like, esoteric or a dead language or something?

I see a recurring barrier to conversation when I mention Lisp — when I mentioned I was learning Python and Ruby, people generally responded positively.  But when I mentioned Lisp, I got incredulous responses, as if I had mentioned Intercal or something.

I think Lisp is being taught *wrong*.  Hidden behind the instruction of recursive function definitions and continuations, there’s a useful imperative programming language there too, and you can start learning Lisp with its familiar constructs before moving on to the esoteric features.

Don’t get me wrong, I think the more esoteric features are important!  But you don’t start learning mathematics in multi-variable calculus.

Let’s try some imperative examples.

(defun factorial (x)
  "calculate factorial the *boring* way"
  ;; create a local variable
  (let ((accum 1))
    ;; note: from 2 to x includes 2 and x
    (loop for current-factor from 2 to x do
          ;; multiply two variables and assign to accum
          (setf accum (* current-factor accum)))
    accum))

(defun fibonacci (x)
  "calculate fibonacci the *boring* way"
  ;; create 3 local variables
  (let ((a 1) (b 1) (tmp nil))
    ;; loop x times
    (loop repeat x do
          (setf tmp a)
          (setf a b)
          (setf b (+ tmp a)))
    a))

Oh no you didn’t!  Two functions notorious for recursive definition, written in imperative style.  Imperative style! A beginner (like me) has to start somewhere, and the best way to start with programming is with working code you can understand and then build on.  So if you’ve never used Lisp, then…

Try it!

First, get yourself a Common Lisp implementation.  There are a variety of them out there.  I used Steel Bank Common Lisp on Windows for this example — it’s marked experimental on Windows, but I haven’t run into any problems yet tinkering.  They also have stable versions for Linux and BSD varieties.

Installation is simple — they have an MSI, but I’d recommend just unzipping it.  Add it to your path or create an alias for sbcl.exe.  Invoke SBCL and you should see:

d:\scratch
$ sbcl
This is SBCL 1.0.22, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.

This is experimental prerelease support for the Windows platform: use
at your own risk.  "Your Kitten of Death awaits!"
*

Which means you’re in business.  Copy all the complete code snippets seen here into ‘example.lisp’, and then you can load it at the SBCL prompt, represented by a star (*).

* (load "example.lisp")

T
* (loop for x upto 30 collecting (fibonacci x))

(1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711
 28657 46368 75025 121393 196418 317811 514229 832040 1346269)
*

Loop is a sort of mini-language in Common Lisp for doing various looping tasks — this one iterates x from zero to 30, collecting into a list the results of invoking (fibonacci x) for each value.  This is roughly equivalent to the Python "[ fibonacci(x) for x in range(31) ]".  Don’t worry about the details for now, just copy and paste.

Lastly if you run into any sort of error, you may see a strange prompt, such as:

* (fiib 2)

...

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

("bogus stack frame")
0]

Ack!  Don’t worry, this is the debugger, and it seems that every Lisp environment comes with one that is on by default.  To run away, just enter either :q or :a… one of these is bound to escape back to the REPL in just about every Lisp I’ve tried so far.  You can then pick up where you left off.

Eventually learning a debugger will be valuable, but to begin with, just try taking it apart one statement at a time. 

0] :a

* (fibonacci 2)

2
*

Back in business.  When done, execute (quit) or press Ctrl-Z.

Where to get Lisp?

I’m currently running all of the following on my dev laptop:

  • Common Lisp
    • SBCL – Steel Bank Common Lisp, as mentioned earlier.
    • ECL – available on almost everything it seems.  Distributed in source form only, but I was able to build it on Windows with Microsoft Visual C++ Express 2008 SP1 in about 5 or 10 minutes on a dinky netbook.
    • GCL – Gnu Common Lisp, available on Windows and Linux.
  • Scheme – (slightly different syntax, will have a post in the future)
    • Chicken Scheme – of interest because it compiles to C using CPS, and uses the C stack as its free store!
    • MzScheme – available on Windows and Linux.  I understand it also contains a graphical UI in addition to the traditional command line.
    • IronScheme – an implementation of Scheme for the CLR and DLR runtimes, interoperates with C# and other managed components.  Not sure if it runs on Mono, but runs fine on Windows.

I had one other Lisp implementation installed earlier this week, it was a space hog.  The ones listed above are in the 10-50MB range, so you really have no excuse for not installing one.

Chicken and ECL can both compile to C.  SBCL compiles to native code and, despite being dynamically typed, can compete favorably with some statically typed languages – see for example the language shootouts of SBCL against C# and Java.

Pick one, install it, learn Lisp, you’re set.  Make it a new-years resolution 🙂

And don’t forget Practical Common Lisp to get you going.

Memoizer 4b – Common Lisp

Continuing this series, in Common Lisp…

Memoizer

As before, we’ll be walking through a memoizing function — for review, it’s a function that transforms an existing function into one that remembers previous results instead of recomputing them.

;;; memoizer
(defun memoizer (f)
  (let ((store (make-hash-table)))
    (labels
      ((replace-f (self n)
                  (let ((found (gethash n store)))
                    (if found
                        found
                      (setf (gethash n store) (funcall f self n))))))
      #'replace-f)))

The first part of memoizer uses the Common Lisp function ‘make-hash-table‘ to create a new hash table, and assigns it to a new local variable ‘store‘.  You might also see this referred to as a ‘lexical variable‘, which simply means that you can only refer to the variable in that block of the program — a variable with the same name in a different block is a different variable.

(let ((store (make-hash-table)))
  ... )

Next, we’ll create a new replacement function nested in memoizer, named ‘replace-f‘.  As in the Lua and Python examples, nested functions can refer to the outer lexical scope as well, so it can make use of ‘store‘ from within this replacement. 

(labels
  ((replace-f (self n)
              ...))
  ...)

Replace-f works like f — it’s a function that takes two arguments, the first being a function to use for recursion, and the second is the value ‘n‘ for which we’re memoizing.  This version will check for a cached value corresponding to ‘n‘, and if not found, recurse on the original function ‘f‘.

Lastly, we recursively call through the variable ‘f‘, the original function.  In Common Lisp, variables and functions are separate namespaces, so to call through a variable, we use the funcall operator (similar to the .call syntax in Ruby):

(funcall f self n) ; call f with (self n) as arguments

Abstracting recursion

You may have noticed something up with how recursion is being done — instead of calling ‘(f n)‘ I’m calling ‘(f something n)‘.  Recall that in the Python example, ‘fib’ was first defined to call ‘fib’ recursively, but since the name ‘fib’ is resolved dynamically in Python, by the time it gets called, ‘fib’ points to a different function — it’s no longer auto-recursive.

Alternately, ‘fib’ could be specified to take the replacement as an argument — and I’m doing that here.  Rewriting fib to take itself as an argument:

;;; fixed-point version of fibonacci function
(defun fib-fixed (self n)
  (if (< n 2)
      1
    (+ (funcall self self (- n 1))
       (funcall self self (- n 2)))))

In this case, you can invoke "(fib-fixed #’fib-fixed 30)" to calculate the 30th Fibonacci number, but it will take a while.  We can now memoize this function though, without replacing the original function, by changing the recursive ‘self’ parameter.

(defun fib (n)
  (let ((memo-fib (memoizer #'fib-fixed)))
    (funcall memo-fib memo-fib n)))

This function creates a new memo of ‘fib-fixed‘, and then invokes it through that memo.
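If the funcall-heavy Lisp obscures the self-application trick, here is the same pattern sketched in Python (hypothetical names, mirroring ‘fib-fixed‘):

```python
# Fixed-point version of fib: the function takes itself
# (or a replacement for itself) as its first argument.
def fib_fixed(self, n):
    if n < 2:
        return 1
    return self(self, n - 1) + self(self, n - 2)

# Invoke by passing the function to itself,
# like (fib-fixed #'fib-fixed 30) in Lisp.
fib_fixed(fib_fixed, 10)
```

Passing a memoized replacement instead of ‘fib_fixed‘ itself is what makes the caching take effect on the recursive calls.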

If this seems inside out, you can always use Common Lisp’s dynamic capabilities to redefine fib as was done in the earlier examples.  Proof by example:

;;; simpler memoizer, doesn't require
;;;  function to provide recursion parameter
(defun simple-memoizer (f)
  (let ((store (make-hash-table)))
    (labels
      ((replacement (n)
                    (let ((found (gethash n store)))
                      (if found
                          found
                        (setf (gethash n store) (funcall f n))))))
      #'replacement)))

;;; define a global named *fib2* that is a function
(defparameter *fib2*
  (lambda (n)
    (if (< n 2)
        1
      (+ (funcall *fib2* (- n 1))
         (funcall *fib2* (- n 2))))))
;;; replace *fib2* with a memoized copy of the current
;;;  value of *fib2*
(defparameter *fib2* (simple-memoizer *fib2*))

;;; wrapper function
(defun fib2 (n) (funcall *fib2* n))
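Since Python also resolves global names dynamically at call time, the same rebinding trick can be sketched there too (hypothetical names, mirroring ‘simple-memoizer‘ and ‘*fib2*‘):

```python
# Like the Lisp simple-memoizer: no recursion parameter
# needed, just a closure over a dict-backed cache.
def simple_memoizer(f):
    store = {}
    def replacement(n):
        if n not in store:
            store[n] = f(n)
        return store[n]
    return replacement

def fib2(n):
    if n < 2:
        return 1
    return fib2(n - 1) + fib2(n - 2)

# Rebind the global name; the recursive calls inside the
# original body now resolve to the memoized replacement.
fib2 = simple_memoizer(fib2)
```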

Try it out!

Copy all the complete code snippets seen here into a file named ‘memo.lisp’, and then you can load it at the REPL prompt, represented here by a star (*).

* (load "memo.lisp")

T
* (loop for x upto 50 collecting (fib x))

(1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711
 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578
 5702887 9227465 14930352 24157817 39088169 63245986 102334155 165580141
 267914296 433494437 701408733 1134903170 1836311903 2971215073 4807526976
 7778742049 12586269025 20365011074)
*

And there ya go!

Lisp Style

Some followup from my previous post on Lisp…

— Lisp Resources —

The book "Practical Common Lisp" is available online, published by the author in its entirety.

Also, CS professor Slava Akhmechet wrote an essay in the same vein, translating from C++ to Lisp — via Xml, of all things.

(In full disclosure, I’ve *never* seen the light of Xml, and verbally abuse it at every opportunity, even if only in a hushed tone.  It’s like a reflex.  But more people these days are familiar with Xml trees, and the essay is well written.)

— Lisp Style —

Also, I’ve noticed I’ve gotten Lisp style slightly wrong.  It is also common to indent to align with the name of the function or language construct being invoked.  This might look like:

   (defun fib (n)
          (if (< n 2)
              1
            (+ (fib (- n 1))
               (fib (- n 2)))))

Note the left paren aligning below and to the right of ‘defun’, and ‘1’ aligning to the right of ‘if’.  The lesser indent for the ‘+’ operation distinguishes the ‘else’ clause for the programmer.

Don’t align incorrectly:

   ; this code lies!  the programmer incorrectly
   ; aligned 'do-something', which is part of the
   ; condition expression
   (if (xyz 1 (* 2 3)
       (do-something))
       (something-else))

Instead:

   ; much better.  now it's clear that 'do-something'
   ; is part of the condition expression
   (if (xyz 1
            (* 2 3)
            (do-something))
       (something-else))

There’s even a Lisp plugin for SlickEdit (my code editor of choice) that will automatically adjust the indent for you, as well as add other editor support, and I hear that Emacs has equally great Lisp editing support.

Memoizer 4a – Lisp for Mortals

Now that this is getting to be a series, I ought to provide links to the previous entries, in case anyone is actually reading this thing:

Other than the C++ example, this series has been part of an experiment to bootstrap my experience in dynamically typed programming languages — and what sort of experiment would it be if I overlooked Lisp?  So here it goes.

All those parentheses!

I can already hear the cries of some programmers, those who were brought up being told Lisp is interesting, but full of all those inscrutable (((parentheses))) that make it so hard to read.  I know because I felt the same way — like many others, I come from C and initially feared the parentheses.

So, to get that out of the way: Lisp parentheses 101!

In Python, you might write:

  1:     # python, step 0
  2:     def fib(n):
  3:       if n < 2:
  4:         return 1
  5:       else:
  6:         return fib(n-1) + fib(n-2)

Or better yet, In C and C++ you might write:

  1:     // C++, step 0
  2:     int fib(int n){
  3:       if(n < 2) {
  4:          return 1;
  5:       } else {
  6:         return fib(n-1) + fib(n-2);
  7:       }
  8:     }

Let’s start translating it to Lisp.

Step One: remove the types from the C function signature, and replace the Python ‘def’ with ‘defun’:

    # python, step 1
    defun fib (n):
      if n < 2:
        return 1
      else:
        return fib(n-1) + fib(n-2)
    // C++, step 1
    defun fib (n) {
      if(n < 2) {
         return 1;
      } else {
        return fib(n-1) + fib(n-2);
      }
    }

Step Two: Lisp takes away operators, and gives you back concisely named functions.  Pretend that “+” and “<” are simple functions, so “x < y” can be reworded “<(x, y)” and so forth.

    # python, step 2
    defun fib (n):
      if <(n, 2):
        return 1
      else:
        return +(fib(-(n,1)), fib(-(n,2)))
    // C++, step 2
    defun fib (n) {
      if( <(n, 2) ) {
         return 1;
      } else {
        return +(fib(-(n,1)), fib(-(n,2)));
      }
    }

Step Three: Let’s replace curly braces {} and indentation with parentheses.  And for good measure, let’s put those function names inside the parentheses so we can always tell which thing is getting which arguments:

    # python, step 3
    (defun fib (n):
      (if (< n 2):
        (return 1)
      else
        (return (+ (fib (- n 1)) (fib (- n 2))))))
    // C++? step 3
    (defun fib (n)
      (if (< n 2)
        (return 1);
      else
        (return (+ (fib (- n 1)) (fib (- n 2))));
      )
    )

Step Four: Now we’re in the home stretch.  Those semicolons are just getting in the way, and I’ve already ripped out the commas from underneath you.  The keywords ‘else’ and ‘return’ are superfluous now.  A little more cleanup, and:

    ; Python-y Lisp! step 4
    (defun fib (n)
      (if (< n 2)
          1
        (+ (fib (- n 1)) (fib (- n 2)))))
    ; C++-y Lisp! step 4
    (defun fib (n)
      (if (< n 2)
          1
        (+ (fib (- n 1)) (fib (- n 2)))
      )
    )

Now the only difference we’re left with is indentation.  C++-y Lispers may want to keep the closing-brace style above, but Scheme style would suggest the whitespace/Python-like approach.

Until next time…

I only picked up Practical Common Lisp today, and this post is long enough as it is, so I’ll save the memoizer for the next post.

Whether you love Lisp or hate it, I hope this post can help you see past all those (((parenthesis))).

Your Favorite PL

In your favorite programming language

I came across this site, hanoimania, when researching application virtual machines.  The author has personally translated a Towers of Hanoi solver into dozens of programming languages, utilizing:

(Although the embedded world uses the ARM architecture more often these days.)

Reminds me of 99 Bottles.

50 in 50

Lastly, I’ve been ranting and raving about 50 in 50 from last month, where Guy L. Steele Jr. and Richard P. Gabriel gave a sort of artistic performance — 50 statements of 50 words each on programming languages.  I’m still looking up historical references from this thing!

From the performance:

Alt text: We lost the documentation on quantum mechanics. You’ll have to decode the regexes yourself.