Microsoft is preparing to provide Hadoop, a Java software framework for data-intensive distributed applications, for Windows Azure customers.
Hadoop offers a massive data store upon which developers can run map/reduce jobs. It also manages clusters and distributed file systems. Microsoft will provide Hadoop within a “few months,” said a Microsoft executive who wished to remain anonymous.
The technology makes it possible for applications to analyze petabytes of both structured and unstructured data. Data is stored in clusters, and applications work on it programmatically.
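For readers unfamiliar with the map/reduce model, the idea can be sketched in a few lines of Python. This is an illustrative, single-process toy and not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical, and a real Hadoop job would distribute each phase across the machines in a cluster rather than run it in memory.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence over an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    counts = run_job(["Hadoop runs on clusters", "clusters store data"])
    print(counts["clusters"])
```

Because each record is mapped independently and each key is reduced independently, both phases can in principle be spread across many machines, which is what makes the model suited to very large data sets.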
“They are probably seeing Hadoop adoption trending up, and possibly have some large customers demanding it,” said Forrester principal analyst Jeffrey Hammond.
“Microsoft is all about money first; PHP support with IIS and the Web PI initiative were all about numbers and creating platform demand. If Hadoop support helps create platform demand for Azure, why not support it? Easiest way to lead a parade is to find one and get in front of it.”
Microsoft’s map/reduce solution, codenamed “Dryad,” is still a reference architecture and not a production technology.
Further, AppFabric, a Windows Azure platform for developing composite applications, currently lacks support for data grids. Microsoft has experienced difficulty in porting Velocity, a distributed in-memory application cache platform, to Windows Azure, because Velocity requires administrative privileges to install, the anonymous executive told SD Times.
“Do they feel so ‘way behind’ that they are rolling out a Java-based product without a .NET-based ‘superior’ alternative ready to go?” asked Larry O’Brien, a private consultant and author of the “Windows & .NET Watch” column for SD Times. “Perhaps they feel that distributed map/reduce is not really all that important, that they can put Hadoop on the ‘check-off box’ and it won’t be embarrassing that it gives Java developers a capability that .NET developers don’t have?”