When it comes to high-performance computing applications, OpenMP has long been the standard open API for the job. But with new processors and new efforts from the companies behind those processors, OpenCL (Open Computing Language) has emerged as a new challenger in the HPC space. According to Evans Data Corp., OpenCL is now the second most popular HPC tool, behind Intel’s Threading Building Blocks, and its surveys show that OpenCL adoption has increased since 2009.

AMD is preparing a suite of new tools for OpenCL developers, tools it hopes will help spread adoption of this open framework for writing heterogeneous compute applications. The aim is to compete with nVidia’s CUDA tools and framework.

The most significant recent change to both platforms has been the unification of memory space across RAM and VRAM, which lets developers track their application’s data in a single address space rather than maintaining two separate memory maps.
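
In OpenCL terms, that unification shows up in how buffers can be allocated so the host and the device share one allocation. The sketch below is illustrative only: it assumes a context and command queue already exist, the helper name is made up, and error checking is omitted.

    /* Illustrative sketch: a host-visible OpenCL buffer that can back both the
       CPU's and the GPU's view of the data on hardware with unified memory.
       Assumes ctx and queue were created elsewhere; error checks omitted. */
    #include <CL/cl.h>

    cl_mem make_shared_buffer(cl_context ctx, cl_command_queue queue, size_t bytes)
    {
        /* CL_MEM_ALLOC_HOST_PTR asks the runtime for host-accessible memory, so
           it can avoid keeping a second copy in VRAM where the hardware allows. */
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                                    bytes, NULL, NULL);

        /* Mapping gives the CPU a pointer into the buffer; unmapping hands
           ownership back to the device without an explicit copy call. */
        void *host_view = clEnqueueMapBuffer(queue, buf, CL_TRUE, CL_MAP_WRITE,
                                             0, bytes, 0, NULL, NULL, NULL);
        /* ... fill host_view with input data ... */
        clEnqueueUnmapMemObject(queue, buf, host_view, 0, NULL, NULL);
        return buf;
    }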

Now that the two systems are nearing par with each other, AMD has decided to step up its game by releasing optimization and development tools that fill in the gaps it sees in the OpenCL ecosystem.

Further proof of AMD’s newfound commitment to tools and HPC came in May 2010, when Manju Hegde left nVidia to join AMD as corporate vice president of its products group.

“The promise of OpenCL is that you can optimize,” he said. “To give meaning to that, we’ve developed tools.” He went on to say that much of the work in HPC is not writing the functionality, but streamlining the code to be as fast as possible.

“OpenCL’s promise is that it works across platforms,” said Hegde. “It works across CPUs, GPUs and in the low-power space. It works across vendors. To give meaning to that message, the first thing we’ve done is invested in tools that allow development across CPU and GPU.”
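
That cross-vendor promise is visible even in the most basic host code: the same few calls enumerate whatever platforms and devices are installed, whether they come from AMD, nVidia or Intel. The following is a minimal sketch with error checking omitted.

    /* Minimal sketch: list every OpenCL platform and the CPU/GPU devices it
       exposes. The same source works across vendors; error checks omitted. */
    #include <stdio.h>
    #include <CL/cl.h>

    int main(void)
    {
        cl_platform_id platforms[8];
        cl_uint num_platforms = 0;
        clGetPlatformIDs(8, platforms, &num_platforms);
        if (num_platforms > 8) num_platforms = 8;

        for (cl_uint p = 0; p < num_platforms; ++p) {
            char platform_name[256];
            clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME,
                              sizeof(platform_name), platform_name, NULL);

            cl_device_id devices[16];
            cl_uint num_devices = 0;
            clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 16, devices, &num_devices);
            if (num_devices > 16) num_devices = 16;

            for (cl_uint d = 0; d < num_devices; ++d) {
                char device_name[256];
                cl_device_type type;
                clGetDeviceInfo(devices[d], CL_DEVICE_NAME,
                                sizeof(device_name), device_name, NULL);
                clGetDeviceInfo(devices[d], CL_DEVICE_TYPE,
                                sizeof(type), &type, NULL);
                printf("%s: %s [%s]\n", platform_name, device_name,
                       (type & CL_DEVICE_TYPE_GPU) ? "GPU" : "CPU/other");
            }
        }
        return 0;
    }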

Right tools for the job
Those tools run the gamut from debuggers to performance profilers. gDEBugger is designed to help developers find trouble spots in their applications, while AMD CodeAnalyst, a performance profiler, is designed to explicitly point out bottlenecks to developers.

Elsewhere, AMD is also releasing an LLVM extension, a kernel analyzer and an application profiler. All of these tools will be released later this year, and some will be made open source, said Hegde.

But AMD’s efforts with OpenCL don’t end with the developer. Hegde said the company is also producing its own college-level course in OpenCL and offering the materials to professors. AMD has even partnered with an education startup to solve one of the biggest problems in computer-science classes: grading projects. The startup automates CS project grading, provided the projects are written in C or C++; professors simply upload submitted projects to a website, and grading occurs online.

But the real proof of OpenCL is in its use in the real world. Sean Varah, CEO of MotionDSP, recently transitioned his team from nVidia’s CUDA tools to OpenCL. “To be honest, I was really pessimistic about OpenCL,” he said.

“It basically took nVidia two years to get a stable SDK and driver out. Eighty percent of my business is with the military, so my software can’t break. It’s a problem we were having with CUDA in the early days, and if my software blows up, I can’t blame anyone else.”

Sanford Russell, director of marketing for nVidia’s CUDA platform, admitted that it took time for the CUDA tool chain to evolve. “With CUDA 1, 2 and 3.0, we were filling in major pieces of the wall. With 4.0, a lot of the technology from last year has been geared toward making it easier for people to get their applications ported to the GPU, and easier to program in parallel on the GPU,” he said. Thanks to those years of development, he added, the CUDA tool chain is now mature and offers the capabilities demanded by the community.

Varah said that it takes time for any tool chain to evolve, and that when he looked at AMD’s OpenCL offerings, he expected the same timeframe. “I saw a three-year trajectory there, with at least two years for AMD to get the tools stable,” he said.

“It’s one thing to port the GPU code; it’s another to optimize for the hardware. We weren’t planning on porting to OpenCL for another year. But AMD came to us and said, ‘Try it out.’ So we ported our product to OpenCL.

“The initial port wasn’t that hard, but it was buggy as hell and performance sucked. We gave frank feedback to AMD, and to their credit they listened. In November of 2010, things were looking kind of grim. But AMD jumped in on three different levels; they listened to where the bottlenecks were on performance. That three-year trajectory was turned into 1.5 years.”

Varah said that OpenCL wasn’t too difficult for his team to get acquainted with. “From our side, the actual coding in OpenCL isn’t that complicated. Certainly, changing the architecture of your code to the manycore paradigm is a big architectural change,” he said.

“That is a bit of a shift, but on the other hand we kind of had to go back and start from scratch. The initial port didn’t take a lot of time. It’s really the optimization that takes time. It’s really getting up to speed on how you can optimize toward the hardware.”
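
What that restructuring looks like in practice can be shown with a toy example (not MotionDSP’s code): a serial per-pixel loop becomes an OpenCL kernel in which each work-item handles a single element, and the loop itself disappears into the global work size the host specifies at launch. The kernel name and the brighten operation here are made up purely for illustration.

    /* Toy illustration of the serial-to-manycore shift; the function and
       kernel names are hypothetical. */

    /* Serial C: one thread walks every pixel. */
    void brighten_serial(float *pixels, int n, float gain)
    {
        for (int i = 0; i < n; ++i)
            pixels[i] *= gain;
    }

    /* OpenCL C: the host enqueues n work-items, each handling one pixel. */
    const char *brighten_kernel_src =
        "__kernel void brighten(__global float *pixels, float gain) {\n"
        "    size_t i = get_global_id(0);\n"
        "    pixels[i] *= gain;\n"
        "}\n";

The host side compiles that string at runtime with clBuildProgram and launches it with clEnqueueNDRangeKernel, setting the global work size to the pixel count; tuning that launch and the memory access pattern for a given GPU is where the optimization effort Varah describes tends to go.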

All of this is indicative of AMD’s new approach to tools and developers, said Neal Robinson, senior director of global content and application support for AMD. “We focus more on broad developer outreach,” he said. AMD won’t be offering paid-for consulting services to help optimize applications, as Intel does. Rather, AMD is working with its ecosystem of partners so that third parties can offer this type of support. Additionally, the forthcoming OpenCL tools will be made available to developers for free.

AMD’s Hegde said that much of the work that remains to be done is at the compiler level. “There will be lots of compiler work because we want to always stay true to the OpenCL promises of being cross-vendor and cross-platform,” he said. “We have to go from an intermediate layer to target all the ISAs. There is tool work that needs to be done there. Our first reference compiler will be for C++.”

Additionally, he said, LLVM will be the first compiler to receive tooling from AMD. He said that the architecture of the LLVM extensions for OpenCL will be made available, which should allow other compiler teams to integrate support on their own time.