News Facts
• Syncsort, a global leader in data integration acceleration and data protection software solutions, today announced plans to contribute an external sort “plug-in” to the Hadoop open source community as part of the company’s commitment to help make the Hadoop framework more robust and easier to use.
• Accelerating MapReduce processing and improving the performance of the standard Hadoop sort benchmarks are widely seen as areas where domain experts can play a significant role in enhancing Hadoop and unlocking value for the entire community.
• Syncsort’s contribution seeks to enhance the sort framework in Hadoop for all users by making it more modular, flexible and extensible. Organizations can simply use the “plug-in” to bring their existing investments in sort technology into Hadoop, regardless of whether they are using Syncsort’s DMExpress or another solution.
• Additionally, Syncsort is announcing a special DMExpress Hadoop Edition of its record-setting data integration acceleration software that will include Hadoop Distributed File System (HDFS) connectivity and the ability to create jobs in DMExpress’ graphical user interface and run them in MapReduce.
• However, what sets the solution apart is its ability to dramatically improve performance in MapReduce by shifting transformations to the DMExpress engine and utilizing powerful new Hadoop accelerator technology that Syncsort is bringing to market.
• The Hadoop accelerator will make use of Syncsort’s “plug-in” contribution to the Hadoop community to seamlessly improve the performance of MapReduce jobs through sort, and will invoke high performance compression as needed to deliver significant storage savings.
• DMExpress Hadoop Edition makes MapReduce processing more efficient by providing a simple, self-tuning alternative that dramatically enhances performance and facilitates ongoing development and maintenance.
• The solution takes advantage of DMExpress’ light install and resource footprint to enable seamless deployment on all nodes in the Hadoop framework.
• Initially available as part of a limited beta program, DMExpress Hadoop Edition will be generally available later in the calendar year.
Applying 40+ Years of Performance Expertise in Data-Intensive Environments to Hadoop
• Hadoop Acceleration – DMExpress Hadoop Edition features proprietary sort algorithms and transformations that optimize Hadoop. Elapsed processing time for existing Hadoop jobs has been reduced by up to 40 percent in TeraSort benchmarks.
• Greater Efficiency – The solution significantly reduces resource utilization, including CPU, memory and I/O, while improving scalability, which lowers hardware requirements and associated costs.
• Easier to Use and Maintain – DMExpress Hadoop Edition requires no tuning, coding or MapReduce scripting, which significantly increases IT staff productivity and enables greater focus on strategic projects.
Accelerating Hadoop Processing by 2x in Testing at comScore
• comScore, a leader in measuring the digital world and preferred source of digital business analytics, has built and defined a market by leveraging ‘Big Data’ to help its customers succeed.
• The company monitors, collects and analyzes more than 20 billion records a day, amounting to terabytes of information, to provide unique insights about users’ online and offline behavior.
• An existing DMExpress customer, comScore engaged Syncsort to accelerate Hadoop processing and, in benchmark testing, achieved 2x faster performance with DMExpress without additional hardware and with minimal coding and tuning.
• The benchmark testing was completed on a six-node cluster running Cloudera’s Distribution for Hadoop Version 3 (CDH3) and involved terabytes of data.