Yesterday, I spoke with Arun C. Murthy of Hortonworks. We chatted about the future of Hadoop, and specifically about YARN, a.k.a. MapReduce 2.0.
One of the things he said, which struck me as something you don’t often hear from enterprise software companies, was that the work being done on Hadoop 2.0 is being undertaken with the far-flung future in mind. While everyone says they’re thinking about the future in their architecture and design phases, it takes a lot of process, control and, most importantly, time to actually build for it.
In the heady world of cloudy, distributed, super-agile development, it can be easy to get bogged down in the day-to-day add-a-drop-down-box-here mentality of meeting the needs of your enterprise. Sometimes, application longevity can be a bit like a lottery, with the winning software earning itself an unchanging duty for the next 40 years, while half the other applications you’re building end up being in a constant state of flux, always in danger of being absorbed, canceled, or pulled out and rewritten.
Of course, it’s not really like a lottery. The core systems in your enterprise’s value chain have probably been thought out and planned for future endeavors. I know of a major enterprise whose core business-processing component is still written in VAX assembly language. Why? Because the guys who wrote it optimized the system down to the bit. It’s never going to get any faster or more reliable, even though it now runs in an emulated environment.
But as our societal addiction to technology continues to grow, what does “the future” even mean? Did those VAX developers know their software would still be driving the company 30 years later? Frankly, they probably didn’t even think about that. Rather, they focused on building the best possible system they could, and optimized the living daylights out of it.
There was a time when the future of a software application was 10 to 20 years, with 30 or 40 being the outer edge of possibilities. Imagine a world where an application for your enterprise could be running for 100 years. It’s going to happen. The question is, will yours be the team that wrote it?
What does that even mean, in terms of planning? How do you road-map for a century? The answer is, you don’t. Sure, there are a few things you do need to take into account to plan beyond the decade level. The most notable of these would be figuring out a way to solve the UNIX 2038 problem.
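If you haven’t run into it: the 2038 problem bites any system that stores time as a signed 32-bit count of seconds since 1970, which runs out of room in January of that year. Here is a minimal C sketch of the failure mode, for illustration only; the variable names are mine, and the wraparound it demonstrates assumes a 32-bit time_t, which many older UNIX systems still use.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    /* The largest second a signed 32-bit counter can hold:
       2,147,483,647 seconds after the epoch = 2038-01-19 03:14:07 UTC. */
    time_t last_safe = (time_t)INT32_MAX;
    printf("32-bit time_t maxes out at: %s", asctime(gmtime(&last_safe)));

    /* One second later, a 32-bit counter wraps to INT32_MIN, which decodes
       to December 1901 -- timestamps suddenly jump a century backward. */
    time_t wrapped = (time_t)INT32_MIN;
    struct tm *tm_wrapped = gmtime(&wrapped);
    if (tm_wrapped != NULL) {
        printf("One tick later it reads as: %s", asctime(tm_wrapped));
    }

    /* A 64-bit time_t pushes the overflow out by roughly 292 billion years. */
    printf("sizeof(time_t) on this system: %zu bytes\n", sizeof(time_t));
    return 0;
}
```

The fix is conceptually simple (widen the counter to 64 bits), but it has to be carried through every file format, database column and wire protocol that ever serialized a 32-bit timestamp, which is exactly the kind of plumbing a century-scale plan has to account for.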
But beyond that, the secret to planning software for the super long haul is stuff you should already be doing: documentation, modularization, optimization. Lots of -ations, frankly.
I don’t think the Hadoop project, which Murthy was discussing, is quite ready for the 100-year level of planning, but its developers are already aware that the next generation of this software will be staying put for quite some time, however it turns out.
But this made me think, quite heavily, about the potential for these new cloud operating systems and processing environments to become a much longer-lasting foundation. If Hadoop’s 1.0 branch is still being used five years after its experimental release, Hadoop 2.0 has the potential to be in the field for 10 years. Or 50. Or 100. What will your application look like in 100 years? It’ll probably still be inside a VM, and if you’ve done everything right, it’ll still be the same code, running alongside 100,000 newer applications that have all been layered on top of it.