One of the most common questions posted in the programming section of Reddit is a request for suggestions of interesting projects to develop. I know this predicament well: having the skills and needing a meaty problem to solve.
Many years ago, when I was young and the world was bright with possibilities, and programming—that extraordinary activity by which worlds entirely of my own imaginings are created—had the seductive appeal of magic, I read a set of articles in Byte magazine that transformed my view of what was possible. It was late 1985, and Byte was running a series of pieces written by Jonathan Amsterdam about how to create a virtual machine that implemented a virtual processor, followed by a companion set of articles on writing a simple compiler.
These articles arrived at a perfect time in my life. I had just mastered C (after having made a living for several years as a COBOL jockey) and I was looking for system-level software to write. Amsterdam’s VM, assembler and compiler were a perfect project, and by slowly implementing the system he described during the course of the following year, I learned a tremendous amount.
We have all experienced the surge when a program first runs correctly. But few explosions of beta-endorphins and adrenaline can match the one that pours over you when a program you wrote assembles to an executable that runs correctly on a VM you wrote. The multiple layers of joy and satisfaction lift you into a high that lasts for days. Due to time demands, I never went much past this point, but I saw the promised land and knew I could get there.
One limitation at this pre-Internet time was that Amsterdam’s articles, although long, were perforce too short to do justice to the topic. I had to extrapolate portions. Fortunately, I was friendly with a compiler writer who could point me in the right direction when I came to unexpected crossroads.
The value of the project to me cannot be overstated. Having worked at these various low levels, I gained an uncommon feel for how compilers and operating systems enabled my code to work. I also gained detailed knowledge of how linkers and libraries work. All this was very useful for writing software in C, and it retained much of its value when the world shifted to runtime environments such as the JVM and .NET.
The problem is that the articles are no longer available unless you’ve saved the old hard-copy issues of Byte. However, there exists an excellent alternative, which in some aspects trumps the original series.
It’s a book from MIT Press entitled “The Elements of Computing Systems: Building a Modern Computer from First Principles.” This volume, currently in its second printing, starts at a more fundamental level: the semiconductor components and the binary logic by which they execute instructions. Upon this foundation, the authors, Noam Nisan and Shimon Schocken, build a software implementation of the processor (a CPU emulator), then an assembler, a stack-based VM, and finally a compiler for a small high-level language.
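To give a feel for the assembler and stack-based VM layers described above, here is a minimal sketch in Python. It is not the book's VM or Amsterdam's design: the instruction names (push, add, sub, mul, dup, jump, jump_if_zero, halt) and the functions assemble and run are hypothetical, chosen only for illustration. The "assembler" simply turns text lines into (opcode, operand) pairs, and the VM executes them against a stack.

```python
def assemble(source):
    """Translate assembly text into a list of (opcode, operand) tuples.

    Hypothetical syntax: one instruction per line, an optional integer
    operand, and '#' starting a comment.
    """
    program = []
    for line in source.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        operand = int(parts[1]) if len(parts) > 1 else None
        program.append((parts[0], operand))
    return program


def run(program):
    """Execute an assembled program and return the final stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op in ("add", "sub", "mul"):
            right = stack.pop()
            left = stack.pop()
            if op == "add":
                stack.append(left + right)
            elif op == "sub":
                stack.append(left - right)
            else:
                stack.append(left * right)
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "jump":
            pc = arg
        elif op == "jump_if_zero":
            if stack.pop() == 0:
                pc = arg
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


SOURCE = """
push 2
push 3
add        # stack is now [5]
push 4
mul        # stack is now [20]
"""

print(run(assemble(SOURCE)))  # prints [20]
```

Even a toy like this shows why the layers are satisfying to build in order: once run works, every improvement to assemble (labels, symbolic constants, error messages) can be tested immediately by executing real programs.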
The code is available from the book’s website, so the book itself focuses on discussing the key points rather than printing exhaustive listings. The code is written, ironically, in Java. Compiled versions of the software, complete with GUIs for watching the internal operations of the CPU emulator and the VM, are also available for download. This enables the developer with itchy fingers to jump in at any point. There is even an IDE for writing and debugging programs in the book’s high-level language.
The book, which runs to just over 300 pages, is thoroughly illustrated and very approachable. As a result, I highly encourage developers looking for a project to read through it from the beginning. For the parts they’re not interested in implementing, they can just read the text, get the gist of how it works, and then move on to the portions that are particularly interesting.
There are few alternatives to this text, especially for developers interested in learning to write VMs. Search the Web for information on writing VMs and you’ll come across scattered articles and a few academic papers. There is one book on design and implementation of VMs in C/C++ that cannot be recommended. Other than that, the knowledge seems to be the province of a cabal of rare programmers. Given that scarcity, the volume by Nisan and Schocken, which discusses some of the basic issues, makes a good starting point.
So, if you know a CS grad or a self-taught developer looking for a satisfying project that is big enough to occupy them for a while and teach them key principles of computer design, this project is just the ticket.
Andrew Binstock is the principal analyst at Pacific Data Works. Read his blog at binstock.blogspot.com.