February 20, 2013

Implementing brainfuck, part 2

Update: Faster binaries posted. Source coming shortly.

In Part 1, I presented some benchmarks (see here) for my brainfuck implementation against some of the best.  I'll now lay out the reference interpreter, the optimized interpreter, and source code.

(I'm slower posting all this than I meant to be:  apologies --- dayjob and all...)

To start, the reference interpreter looks like this:
The source code is here: bffsree_ref.c and bfsree.h.
[
updated: fixed missing header]

Executables for Windows and Linux.

Its pretty basic --- pretty much add/subtract, move left/right, get/put, and then loop begin/end. It basically is as basic as you can get, though it include three things that are, only in the loosest sense, "optimizations":

  1. Although the interpreter will ignore non-valid characters at runtime, they are actually stripped at parse time.
  2. I accumulate operations into a counter at parse time (e.g. ++++ becomes +4)
  3. I parse the jump locations for loops at parse time.
These are so simple to do inline that it feels like they hardly qualify, and you'll see all 3 in pretty much every implementation (I'd guess).

And presented below is the optimized implementation that I presented in Part 1:

Note that it is NOT materially different.
The main differences are:

  • The collapse of the command and "helper" arrays
  • Double-instructioning -- whereby it is assumed that to do anything interesting in brainfuck, you always have to move the pointer, so every instruction has a move-the-pointer offset (saves on instruction decodes)
  • The inclusion of the "super instructions" that perform the following common tasks:
    1. Pointer shifting in a loop
    2. Multiplication of a cell by a constant
    3. Setting a cell to a value
    4. Multiplication of a cell by a constant, then zeroing
    5. Multiplying two cells
The first 2 are pretty straighforward -- most of the work is in the third item: collapsing the existing instructions into the optimized "super" instructions without breaking any contracts.  The primary optimizations are around loop invariance.

There are a LOT more optimizations I didn't yet finish tackling -- dead code elimination and constant propagation, in particular -- that should yield materially better performance.

I'll post the details and source to the optimizing implementation in a future post.  

(Oh, btw, forget to mention -- if you use the "-c" option with the optimized implementation from part 1, it will convert your brainfuck program to somewhat performant C code.)

February 10, 2013

Implementing brainfuck

Updated:  Part 2 is now available.

For those not aware, brainfuck is a small programming language whose sole existence is to amuse programmers. It has certainly been amusing.

In that vein, I got myself a nice X1 Carbon Touch with Windows 8 a few weeks ago, and as always with a new laptop, I play with a few projects to "break it in". I have a fun little image viewer that I upgraded to touch (a few more things to clean up - may post it), and one of the other toys I thought I'd play with was a brainfuck interpreter and debugger. The debugger part is important, as I've done a few language implementations before, but not all the way through the tool chain, and thought it'd be informative. Sadly, I'm not through the whole tool chain yet -- still have to finish the editor/debugger, but the basic mechanics for the implementation are in place.

Because I'm competitive, I wanted to also try some interpreter optimization ideas. I'd been kicking them around with a Javascript interpreter I did over the holidays (*cough* 2011 --- will get back to it at some point). Brainfuck seemed simple enough that it would be a nice testbed.

Anyway, I'll post some more details and source code, but thought I'd start with some results and executables to "whet the appetite" as it were.

For those who are unfamiliar, brainfuck has only 8 operators: "<>[]+-,." ... and "Hello World" looks like this:
>+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.[-]
>++++++++[<++++>-] <.>+++++++++++[<++++++++>-]<-.--------.+++
.------.--------.[-]>++++++++[<++++>- ]<+.[-]++++++++++.
I first built a reference interpreter, and then found a few fast online intepreters: Alex Pankatrov's moderately optimizing implementation (bff) and Oleg Mazonka's "fastest in class" interpreter (bff4) (according to Wikipedia). They both crushed my reference interpreter, which lead to my optimized implementation. Compiled versions are included here:
UPDATE: Linux binaries updated; had uploaded in text mode (d'oh!).
UPDATE 02.27.13: Slightly faster binaries updated again.

Performance is as follows:
Linux:
program:  mandel.b  hanoi.b  sisihi.b  long.b  
bff6.0s12.3sn/a14.4s
bff46.8s0.5s5.9s3.1s
bffsree2.7s0.05s1.4s0.2s
Windows:
program:   mandel.b   hanoi.b   sisihi.b   long.b  
bff 7.6s 16.2s n/a 17.1s
bff4 6.8s 1.3s 4.1s 3.1s
bffsree 3.4s 0.8s 1.2s 0.16s
Some notes:
  • GCC 4.7 for Windows and 4.6 for Linux were used.
  • bffsree and bff can read from a file or stdin.  bff4 reads from stdin.
  • "sisihi.b" is a brainfuck self interpreter running itself running "hi123.b" -- basically the executable you pick runs a brainfuck program that is a self intepreter running a self interpreter (itself) that runs a simple brainfuck program.  Which is why they call it brainfuck.
  • The difference between Linux speed and Windows speed appears to be gcc version and console speed (Windows console is slow).
  • Both bff and bff4 seem to fail with "bench.b" (or are crazy slow) --- not sure what that's about.
  • Oleg's bff4 appears to have been the fastest since 2009 -- which is quite impressive.
I'll discussion implementation notes in a follow-up post shortly.
Continued here.