Within the past several years, the GPU has gone from being supercharged graphics hardware to a medium for parallel programming. Now, NVIDIA is attempting to take GPU development to the masses with a new tool set for Microsoft Visual Studio.

NVIDIA today released a free edition of Parallel Nsight, a GPU-accelerated application development tool set that integrates with Visual Studio 2008 SP1 Professional edition. It can be deployed on Windows HPC Server 2008, Windows 7 and Windows Vista.

Nsight can be used to develop Compute Unified Device Architecture (CUDA) C/C++ applications, or applications that use Microsoft’s DirectCompute DirectX API. CUDA was created by NVIDIA to make its hardware accessible to developers. NVIDIA developed CUDA C/C++ as an extension to C/C++ for parallel programming. The CUDA C/C++ language is being taught at more than 350 universities, the company claims.
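
To give a sense of what "an extension to C/C++" looks like in practice, here is a minimal sketch of CUDA C/C++ that adds two arrays on the GPU. It is an illustration only, not code from NVIDIA or from Parallel Nsight: the names `vectorAdd`, `addOnGpu` and `check` are hypothetical, and the block size of 256 threads is an arbitrary assumption. The CUDA-specific pieces are the `__global__` qualifier, the built-in thread indices and the `<<<...>>>` launch syntax; everything else is ordinary C++.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// Kernel: runs on the GPU, one thread per array element.
__global__ void vectorAdd(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partial block
        out[i] = a[i] + b[i];
    }
}

// Illustrative helper: turn a CUDA error code into a C++ exception.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
}

// Hypothetical host-side wrapper: copy inputs to the GPU, launch the
// kernel, and copy the result back. For brevity, device memory is not
// released if an error is thrown part-way through.
std::vector<float> addOnGpu(const std::vector<float>& a,
                            const std::vector<float>& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("input sizes differ");
    }
    const int n = static_cast<int>(a.size());
    std::vector<float> out(n);
    if (n == 0) {
        return out;
    }
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    float* dA = nullptr;
    float* dB = nullptr;
    float* dOut = nullptr;
    check(cudaMalloc(reinterpret_cast<void**>(&dA), bytes));
    check(cudaMalloc(reinterpret_cast<void**>(&dB), bytes));
    check(cudaMalloc(reinterpret_cast<void**>(&dOut), bytes));

    check(cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice));
    check(cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice));

    // Assumed launch configuration: 256 threads per block, enough
    // blocks to cover all n elements.
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vectorAdd<<<blocks, threadsPerBlock>>>(dA, dB, dOut, n);
    check(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost));

    check(cudaFree(dA));
    check(cudaFree(dB));
    check(cudaFree(dOut));
    return out;
}
```

A caller would pass two equal-length `std::vector<float>` inputs to `addOnGpu` and get the element-wise sum back. Code of this kind is compiled with NVIDIA's `nvcc` compiler and needs a CUDA-capable GPU to run.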

The Parallel Nsight tool set allows applications to execute across both CPUs and GPUs, but the GPUs must support NVIDIA’s CUDA architecture. Several NVIDIA GPUs are capable of running CUDA applications, said Sanford Russell, general manager of GPU Computing at NVIDIA.

GPUs have a parallel throughput architecture that can be exploited to accelerate applications in finance, graphics, cryptography and other disciplines.

Parallel Nsight packages 10 of NVIDIA’s development tools that were previously standalone, integrating build, debugging and profiling, said Russell. It plugs fully into Microsoft’s debugging engine, a feat that took NVIDIA a year and a half of engineering work to accomplish, he added.

“Research shows that developers believe the most difficult tasks when developing parallel applications are debugging, performance tuning and designing parallel algorithms,” said David Rich, director of technical computing at Microsoft. “By integrating GPU computing into Visual Studio, NVIDIA’s Parallel Nsight is transforming the way GPU-based parallel computing applications are developed for Windows.”

A Professional edition of Parallel Nsight is available as a release candidate. It adds functionality including a CUDA C/C++ performance analyzer, OpenCL and OpenGL analyzers, and DirectX 10 and 11 analyzers.