If you want to take a beautiful photograph of a planet, use a webcam. Digital imaging and software have revolutionized amateur astrophotography. Because objects within our own solar system are quite bright, long exposures are generally not needed; instead, you capture video.

Because of the small sensor sizes of webcams, the images are highly magnified, but even so, a planet falls across only a relatively small number of pixels. Played at full speed, the videos jitter from breezes and even footsteps, and boil with atmospheric distortion even under the clearest skies. But software can align frames on high-contrast features, analyze and average hundreds of them, and produce images that rival the best observatory-based photos of just a few decades ago. (Amateur astrophotography of dim deep-sky objects is perhaps even more spectacular, but it uses different equipment and techniques.)
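The averaging step at the heart of that process can be sketched in a few lines of C. This is a minimal sketch, assuming 8-bit grayscale frames that have already been aligned; the function name and signature are mine, not any particular package’s:

```c
#include <stdint.h>
#include <stddef.h>

/* Average n_frames already-aligned 8-bit grayscale frames into one
   low-noise result. Each frames[f] points to n_pixels bytes. */
void stack_frames(const uint8_t **frames, size_t n_frames,
                  size_t n_pixels, uint8_t *out)
{
    for (size_t p = 0; p < n_pixels; p++) {
        uint32_t sum = 0;
        for (size_t f = 0; f < n_frames; f++)
            sum += frames[f][p];
        out[p] = (uint8_t)(sum / n_frames);
    }
}
```

Real stacking software also grades each frame for sharpness and discards the worst before averaging; the payoff is that random noise falls off roughly as the square root of the number of frames stacked.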

While I cannot pretend that I’m ever going to grind a 20” mirror so smooth it can focus light within a fraction of a wavelength, my amateur enthusiasm can easily divert me into musing about higher-performing alignment and information processing. Even more pressing was the discovery that every time I tried to capture video on my laptop, the software crashed. This wasn’t particularly shocking, as my astrophotography camera was something I built one Saturday afternoon with a film canister, a Dremel, and an old Logitech Orbit webcam, and I appeared to be the first person to combine this particular model, driver and capture software.

Before I could do anything interesting algorithmically, I needed to get the software to stop crashing the moment the “Play” button was pressed.

Luckily, the software was FOSS and the source code was easy to find. I’d forgotten how intimidating a first glance at a C project can be: so many files, so little structural guidance! The project was actually well structured, but it still took a few pages of notes about the preprocessor before I had a proper debug build up and running. After that, it was quick work to find an “optimized” transform that assumed a 32-bit architecture.

It was here that things got interesting. On the one hand, this project couldn’t be further from the structure of today’s average enterprise application: no objects, no managed memory, no unit tests, no dependency injection framework. On the other hand, the preprocessor actually led to quite plastic source code and, in a way, allowed for the flexibility that one looks to dependency injection for.
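Here is a hypothetical sketch of what I mean (the names are mine, not the project’s): the preprocessor can swap an entire implementation at compile time, which is the same wiring job a dependency injection container does at runtime.

```c
#include <stdint.h>

/* Compile with -DUSE_FAST_TRANSFORM to swap implementations;
   by default the readable version is wired in. */
#ifdef USE_FAST_TRANSFORM
#define transform_pixel transform_pixel_fast
#else
#define transform_pixel transform_pixel_clear
#endif

/* Readable reference version: plain arithmetic. */
uint8_t transform_pixel_clear(uint8_t v)
{
    return (uint8_t)(v / 2 + 16);
}

/* "Optimized" version: shift instead of divide. */
uint8_t transform_pixel_fast(uint8_t v)
{
    return (uint8_t)((v >> 1) + 16);
}
```

Every call site just says `transform_pixel`; which function actually runs is decided by a compiler flag rather than a container configuration.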

The concerns of the original programmers were apparent: they wanted performance. What was interesting was how clearly the history of assumptions made over the application’s several years of active development was embedded in this low-level code. All that preprocessor flexibility was being used to (cleverly) create a single-threaded function that operated one byte at a time; function calls were bad, unrolled loops, integer math and bit shifting were good.

Ideally, there would have been a version of the transform that emphasized clarity rather than performance. I could have #DEFINEd it into existence, profiled it and started an optimization pass based on measurement. And although I keep two good books on assembly language near at hand (see my May 15 column, “Assembly required”), the more promising route today is parallelism: Intel has recently released version 3.0 of its excellent Threading Building Blocks library. The new TBB is compatible with VS2010 and can run on top of the Windows 7 Concurrency Runtime. While there are only two cores in my laptop, 4-core laptops based on Core i7 are the new hotness, and 6- and 8-core laptops will certainly be available in the not-too-distant future.

Even though the two cores in my laptop probably weren’t going to give me a big jump in multithreaded performance, I wanted to go back to the original algorithm. Unless you have good knowledge of the whys and wherefores of previous optimization passes, you should start from scratch. You might end up going down the same optimization path as your predecessors, but you’ll often discover that your needs or constraints lead you somewhere different.

In my case, I didn’t want the transform to happen at all! The frame data was being converted from a 12-bit-per-pixel YUV format into 24-bit RGB, but for this kind of work, the luma (Y) channel is all that’s needed at capture time.
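To see why skipping the transform is plausible, consider a planar 12-bit format such as I420 (an assumption on my part; I don’t know exactly which format the Orbit delivers). The full-resolution Y plane sits at the front of the buffer, followed by the quarter-resolution U and V planes, so keeping only the luma is just a copy:

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Total bytes in one I420 frame: Y plane (w*h) plus U and V planes
   (w*h/4 each), for 1.5 bytes -- 12 bits -- per pixel. */
size_t i420_frame_size(int width, int height)
{
    return (size_t)width * height * 3 / 2;
}

/* Keep only the luma: the Y plane is the first width*height bytes
   of the I420 buffer, so no per-pixel color-space math is needed. */
void extract_luma(const uint8_t *yuv420, int width, int height,
                  uint8_t *luma)
{
    memcpy(luma, yuv420, (size_t)width * height);
}
```

No multiplies, no shifts, no lookup tables: the grayscale image a stacking program wants is already sitting at the start of every frame.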

Of course, I would have to rewrite the UI and storage routines. And if I were going to do all that, would it be wiser for me to just start from scratch and write the thing entirely in C#? As I pondered this, the corner of my eye caught a glimmer through the front window—Sirius was blazing in the twilight. I stepped outside and looked at Orion falling toward the horizon. Saturn was high in the sky, and if the clouds didn’t fill in, I’d get a shot at Omega Centauri in a couple hours.

The software could wait.

Larry O’Brien is a technology consultant, analyst and writer. Read his blog at www.knowing.net.