Rogue Wave Software, the largest independent provider of cross-platform software development tools and embedded components for the next generation of high performance computing (HPC) applications, today announced the simultaneous release of ReplayEngine 2.0 and TotalView 8.9.1. This suite of products dramatically simplifies debugging and memory analysis, especially for applications that are data-intensive, multi-threaded, or distributed across a network or cluster. With this release, ReplayEngine supports native InfiniBand on both Mellanox and QLogic hardware, opening up its use on large-scale HPC clusters, and TotalView expands its CUDA support to include SDKs 3.1 and 3.2, with CUDA 4.0 support in progress.

ReplayEngine 2.0, the reverse debugging add-on to TotalView, now supports debugging applications that communicate over high-speed InfiniBand networks, making the power of ReplayEngine available to developers working with large clusters. ReplayEngine supports native transport mechanisms with Mellanox and QLogic InfiniBand hardware on MVAPICH 1.2, MVAPICH2 1.5 and 1.6, OpenMPI 1.4.2, and Intel MPI 4.0.

“Now, developers creating complex parallel applications for deployment on high performance clusters have the option of using reverse debugging on these systems,” says Chris Gottbrath, Principal Product Manager at Rogue Wave Software. “With ReplayEngine, developers can allow their program to run until the point of failure, then step backward through the program execution, making hard-to-reproduce bugs easier to find. With this enhancement, we’re making complicated development a little easier.”

GPU technology is evolving rapidly, and HPC developers are adopting new versions of the CUDA development environment at the same pace. TotalView 8.9.1 gives developers the ability to troubleshoot applications built with versions 3.1 and 3.2 of the CUDA Toolkit. This release features support for CUDA function calls on the stack (in addition to inlined functions), host pinned memory regions, and CUDA contexts. It also handles exceptions in CUDA code, displays variables held in GPU hardware registers, and adds command line interface (CLI) commands for CUDA functionality.

“CUDA 3.2 provides a number of new enhancements, including the ability to call and return to functions using a stack,” said Sanford Russell, director of CUDA marketing at NVIDIA. “TotalView’s support for these enhancements allows developers to confidently write code using these advanced features to fully leverage the power of GPUs. This includes accelerating parallel applications running on large-scale clusters.”

In addition to CUDA support, TotalView 8.9.1 includes expanded platform support and enhancements to the multi-dimensional array display, the parallel backtrace features, and TVScript.

The Danish Meteorological Institute (DMI), which is responsible for providing daily weather forecasts, is among the organizations using TotalView. “People are making decisions based on the results of our simulations, so accuracy is absolutely critical. We use TotalView,” says Jacob Weismann Poulsen, a staff scientist at DMI, “both interactively and as a part of our nightly validation process. With TVScript we can validate the code automatically – it allows us to debug via code comments and relieves us from having to rebuild the code again and again.”