Divergent views over supplemental compute
January 23, 2013 —
After five years of work on CUDA for supplemental compute, Nvidia is squaring off with the biggest chipmaker in the world: Intel. These two titans of processing cores are both pushing PCIe-based methods of expanding the compute power of desktops, but their respective approaches and solutions could hardly be more divergent.
The new Intel Xeon Phi coprocessor is quite different from an Nvidia GPU. The Phi is a MIMD (Multiple Instruction, Multiple Data) machine, while the GPU is a SIMD (Single Instruction, Multiple Data) machine. The Phi is useful only for HPC, while the GPU can also run games and graphical simulations. The Phi runs its own Linux and presents itself as a cluster, while a GPU is treated as an attached device, managed alongside the host CPU.
But beyond the technical differences, the two companies are already showing that they have completely different approaches to the growing HPC market. Intel has focused on bringing its existing compilers and tools to this new processing platform, while Nvidia has spent the past five years building a community around its CUDA platform, and around GPU compute in general.
Ian Buck, general manager of GPU computing at Nvidia, said that “Fundamentally, for HPC, it's about expressing the parallelism. These accelerators are designed to process this stuff in parallel.” To that end, developers running their code on an Nvidia GPU write a single thread of work and have it replicated across the GPU's cores into tens of thousands of threads. Buck said this parallelism is easy to achieve with CUDA.
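Buck's model — write one thread's worth of work and let the hardware replicate it — shows up directly in CUDA's kernel syntax. The sketch below is illustrative only (the function name, array names, and launch parameters are this article's assumptions, not Nvidia's code); the point is that the body describes a single thread, and the launch configuration replicates it:

```cuda
// Illustrative CUDA kernel: the body is written from the point of
// view of ONE thread; the hardware runs many thousands of copies.
__global__ void scale(float *out, const float *in, float k, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's index
    if (i < n)
        out[i] = k * in[i];
}

// Launching 256 threads per block over n elements spawns tens of
// thousands of threads for even a modest n:
//   int blocks = (n + 255) / 256;
//   scale<<<blocks, 256>>>(d_out, d_in, 2.0f, n);
```

The same kernel source scales from one thread to millions; only the launch configuration changes.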
“If you try that on 60 cores on a chip, it's a little less clear how you program it,” said Buck, referring to Intel's Phi. “You can treat it like an MPI (Message Passing Interface) cluster on a chip. The challenge of that programming model is each core is wimpy on its own: It's a Pentium Pro-type processor.”
But James Reinders, director of software products and multicore evangelist at Intel, said this is a benefit of the Phi, not a hindrance. Using the MPI programming model to treat the Phi as if it were a Linux cluster makes this desktop HPC environment familiar to existing HPC developers, who have been using MPI for some time.
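The “cluster on a chip” framing means each Phi core can run an ordinary MPI rank, exactly as it would on a networked cluster node. A minimal sketch, assuming a standard MPI installation (compiled with mpicc and launched with mpirun, one rank per core):

```c
#include <mpi.h>
#include <stdio.h>

/* Each MPI rank behaves like one node of a cluster; on Xeon Phi
   the "nodes" are cores sharing a single chip, so existing MPI
   codes can run without restructuring. */
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this core's rank id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total ranks launched */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
```

This is the familiarity Reinders describes: the program above is indistinguishable from what an HPC developer would run on a rack of machines.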
“I think of it as SMP [symmetric multiprocessing] on a chip,” said Reinders. “You've got a collection of processors, a cache-based architecture with vector units, all brought together with extremely high-performance interconnect. What we've done is put it on a single chip. The benefit you get from that is that applications that have been written in MPI or OpenMP should see a very similar environment with Xeon Phi.”
And as for the belief that Phi is a collection of Pentium Pros, Reinders put that to rest. “First of all, they're not Pentium Pros. They are in-order execution cores,” he said. “The Pentium Pro was our first out-of-order execution core. We've said in the past that it's essentially a Pentium core, but the problem with talking about it like that is that Pentium cores didn't have vector units. We have very wide vector units: 16 floats wide. It also has four threads per core. It has 64-bit support, machine exception handling, and power states. The Pentium had none of those.
“The only reason we ever mentioned it was sort of like a Pentium was to get people thinking about the in-order execution. It's not as high-performance per thread. But it's a better solution to overall power consumption.”