Multicore processors, long the mainstay of servers, have made solid inroads on the desktop and even mobile devices, yet the development and QA tools and processes required to exploit them have historically lagged behind.

Fortunately, that’s beginning to change. And while they’re by no means on every developer’s desktop, there are tools available for multicore programming and debugging. That’s good news, because the challenges of multicore development require both training and software.

Erik Hagersten, CTO of Rogue Wave, which sells tools for multicore development, said that many of the parallel programming problems we see today were solved in the 1990s, but that those solutions required training, understanding and expertise on the part of the developer.

“Parallel processing isn’t new. We were building parallel servers at Sun Microsystems, but the difference now is multiprocessing is going to the masses,” he said.

“It’s not just the experts; it’s pretty much everybody. In that respect, we’re missing a lot. The experts are doing fine of course, but the mass market isn’t ready for this.”

And while doctoral theses on parallel programming and the potential for new tools and languages are all very compelling, they do little to help developers who need help today. “Many people are waiting for that magic thing to happen: a new language with new parallelization; improvements for the Javas and the Erlangs,” said Hagersten.

“That is happening slowly, but not fast enough, and people need to write their code here and now. The largest problem is non-determinism. How can you fix a bug if you can’t recreate it?”

The problem is even more profound in the embedded market, where non-determinism is unacceptable within systems upon which human lives depend, said Greg Rose, vice president of marketing and product management at DDC-I, which sells DO-178B certifiable embedded operating systems and tools for use in flight safety-critical avionics applications.

“To date, most [safety-critical embedded systems] customers using multicore processors are turning one of them off,” he said. “They’re using it in a single-core implementation due to all these effects. It’s really up to the software developers as to how you utilize [this] extra power granted to you with multicore.”

Embedded developers favor using a single core, said Rose, because non-determinism makes reliable testing almost impossible. Running an application twice under the exact same conditions can yield two entirely different behaviors due to the timing involved in passing processes to two processors that share memory and cache. Even attaching a debugger to the application can change the outcome of a process because the debugger itself alters the timing of execution.

“For determinism, you don’t want to have contention between resources. That could be memory, could be cache, or it could be the system itself,” said Rose.

“You don’t want a system acting non-deterministically. When you put in the multicore, it’s not like multiple discrete computers. It’s on a chip, with shared memory and shared cache, which increases the non-determinism. We have some software for cache partitioning, and being able to segment your cache is important. Also, there are these resource contentions that come along and cause you to have to budget more time. Even if you budgeted for worst-case scenario timing, your average execution time is still going to be nominally dependent on that.”

Thus, many solutions to the problem can cause other problems. It’s a tricky set of difficulties to navigate, especially when the underlying goal is one of improved developer productivity.

“Education is part of it,” said Hagersten. “This is not new; there were several techniques developed in the 1990s, but with those techniques you wouldn’t see the productivity.

“In the short term, I would find the best tools and environments. The problems we’re running into now are so subtle and hard: race conditions, deadlocks and non-determinism.”

The heart of the problem, said Eli Boling, Embarcadero’s manager of compiler development, is that “there’s no silver bullet for the general-purpose programmer on multicore, mostly because regardless of how many cores you put out there, it’s non-trivial to take your general-purpose application and make it parallel.

“There are too many things about them that are serial. There is a set of areas that are really susceptible to parallelization, like digital-signal processing, image processing, simulation, some physics problems where they’re doing large array calculations, or security operations where you’re looking at very large sets of data you have to work on.”

And that is where much of the current focus within enterprises has fallen: using multicore systems to process big data.

James Reinders, chief evangelist for Intel’s software products division, said, “I think people know we’re under a crush of data. We’re making things like audio, video, HD video, and we’re collecting lots of data we want to process. Well, it’s natural that people are going to want to compute on this data and find the hidden gems that make their business better.

“Parallelism is the only way we’re really going to be able to tackle this growing amount of data. We need to find parallel methods for that. Co-arrays in Fortran is a hint at a feature going for that. I feel like people are turning toward understanding the problems they need to solve, and parallelism is part of the solution, rather than just saying, ‘I want parallelism.’ ”

Tools for the job
That’s not to say that there are no tools for parallel developers. There are lots of tools. One of the best known is Intel Threading Building Blocks, a C++ template library that helps developers create and manage work spread across multiple cores.

Reinders said that the company has been searching for additional ways to help developers deal with multiple cores. “The tools have made a lot of strides. If you take a look at C and C++ programming, you’ll see Intel introduced some aggressive tools in May of 2009 and updated them last year in September, and actually took some of these things out further.

“The more-advanced HPC tools have benefited from the new ease-of-use tools that have been added. C and C++ programmers have new capabilities. Developers shouldn’t be very confused. There’s a promotion of what we call task-based programming, instead of thread-based programming, which frees up a developer to do more abstraction in C and C++, and helps with debugging.”

Hagersten said that Rogue Wave also has tools available now and in the works that can help ease the development process on multicore systems. “If you look in the debugging environment, one of the other tools we have in our portfolio is a tool that turns a multi-threaded execution into a deterministic environment,” he said.

“It’s a debugger where you can execute forwards as well as backwards. It’s a more intelligent way of working with today’s technologies rather than sitting and waiting for the magic language.”

DDC-I also has tools for embedded developers that can help smooth the rough edges of multicore software design. “What we’re pioneering here at DDC-I is a way to help characterize and minimize, in-bound, these resource contentions, which can lead to really inflated worst-case execution times,” said Rose.

“Cache partitioning is one of the technologies we have here. We’re using other techniques once you’ve got these building blocks in place, and we think the underlying operating system is the key to making this work. Then you can test the application software to make sure the time budgets have been set appropriately so you’re not going to run into scenarios where, because of resource contentions, we can’t get our job done.”

Threading Building Blocks is also evolving, said Reinders. “I get very excited about the things we’re updating. They make Threading Building Blocks easier and address deployment challenges.

“One of the really cool things that got added was full support for C++ lambdas. That’s the most exciting new feature of the new C++ standard. Lambdas give a very concise way to specify ‘here’s some code,’ without having to go off and create a whole function definition for it. This turns out to be very useful for parallelism. You can say, ‘Here’s the code I want running in parallel.’ Anytime code is easier to write, it’s less error-prone and easier for people to work on that code with you.”

Non-deterministic future
The future of multicore development isn’t going to change overnight. While the tools continue to improve, the ability to automatically turn a generalized program into a parallel one isn’t likely to appear on the horizon anytime soon.

“It’s definitely going in the right direction, but it’s still far away from the golden goal of automating that or making sure that hard-to-find problems won’t occur,” said Hagersten. “Tools don’t solve the hard problem of making sure the programmer understands how to get performance from the program. I still wouldn’t say we are even close to the productivity we had even 10 years ago. We are used to improving productivity all the time, but now we’re going backwards.”

Reinders said that one of the more interesting areas of research for parallelizing big data at Intel has been around Fortran. “Fortran has been making some equal strides,” he said. “It may surprise some people that have had solutions like OpenMP and MPI, but there’s been some interesting things going on in Fortran. Co-array, for Fortran developers, is really cool, but it hints at a trend we’re going to see spill over into other languages. It’s designed to handle big data. This is really a topic unto itself.”

“I think it’s unlikely it’ll ever turn out to be an automated process,” said Embarcadero’s Boling. “I think the various research projects will bear fruits about language constructs that turn out to be useful for helping people do functional programming, or building applications in a style that tends to be more susceptible to automatic compilation.”

DDC-I’s Rose said that the future of multicore in embedded systems could take an entirely different route than that seen on servers. “We’re looking at asymmetric multiprocessing, where you have a complete copy of your OS running on each core, and you hard-schedule each one of those different cores. What that does is aids in the predictability.

“Imagine if you’ve got cache issues in non-multicore. Imagine if there’re other [areas] where you’re trying to do multi-processes on multicores. The non-determinism can actually get worse. Where we’ve seen our customer’s interest is where they have rigid time and space partitioning, where they partition and say exactly this much time is available to each task, and the one task can’t corrupt the other task. They want hard control over what’s running on the cores to make sure they can completely characterize the effects of determinism. The more you constrain the device, the more you can characterize and put together your schedule.”

That future of running multiple operating systems in tandem on multicore embedded systems, however, is not here yet, said Rose. While servers have long been virtualizing operating systems and cramming multiple machines’ worth of applications onto a single box, the embedded world remains beholden to the reliability that only a single processor core can bring.