Kinect to your applications

Published: January 10th, 2012

It is an exciting time to be in the world of software development, yet by the same token, it is a scary time. The pace of change seems to be ever-increasing, and things that were stable for a long, long time are experiencing revolutions. User interfaces and user interaction modalities are no exception and may be the areas that end up causing the most turmoil for development teams in the years ahead.

The question used to be whether an application was best suited to Windows, which required an installer, or if it would benefit from an installer-free Web application format. We still get to make that decision, though it is further complicated by client technology choices, but now there are extra dimensions that complicate decisions far more than extra choices. There is a batch of technologies pushing the possibilities of interface design to the next level.

If HTML5 jumped to mind, then you are on the tame side. While HTML5 does promise to bring media and game-like interfaces to the browser, it is not going to be nearly as disruptive as products like the Microsoft Surface and Kinect. They represent the wild side of things because they embody moving away from the keyboard and mouse input as the primary interface to the application. These are not just for games, as we will see by the examples of where these technologies are being leveraged already. (If you have not heard of the Surface before, in its first version, it is a coffee table-sized and shaped computer screen that lets users interact with it via as many as 20 discrete, simultaneous touch points.)

There is a strong temptation to consider these as amusing niche technologies that do not have any bearing on the future of business applications or even consumer applications beyond games. For another year or two, that is probably a low-cost strategy, but it assumes that this is not the direction of things to come, and that is a mistake. The danger for developers and project managers is that there is risk in adopting these new technologies immediately and there are risks in not embracing them early enough. If you ignore them, the world can—and likely will—pass you by. If you embrace them and try to just wedge them into your old designs, things are not going to work. The trick is to understand where they fit and how they can be used effectively.

Reach out and touch
The first of the next-generation interfaces that everyone would agree has fully infiltrated society—and done so seemingly with blinding speed—is touch interfaces on smartphones. It is easy to overlook this development since it seems to be niche and suited to purpose rather than the start of a change in the bigger picture. All of the smartphones out on the market that are vying for dominance use the touch interface made popular by Apple with the iPhone. Apple did not invent touch as an interface of course, but it certainly brought it mainstream.

Now, flick and pinch are part of the general computing vocabulary. If that last sentence contains two words that have no meaning for you, and English is your primary language, then you must get up to speed with touch interfaces and jargon immediately. Step one is to get a touch interface phone most likely running an OS from Microsoft, Apple or Google, and then use this guide to explore the touch interface. Even if you know what flick and pinch mean, you might find the guide useful.

The mistake is to assume that this revolution ends with phones. We are already seeing that tablets are participating in this touch revolution, and the coming Windows 8 has already sworn allegiance to supporting touch in a big way at its core. It is not just for games and it is not just for phones. If you have trouble imagining how this will spread to business apps or productivity applications in general, then join the club, but that is all the more reason to get ahead in understanding the potential. As you will see, these are some of the places where next-generation interfaces are already making inroads. Computers beyond tablets are the next step in the progression, and Microsoft’s Surface, which will soon be released as version 2, embodies where this interface can go.

If Microsoft appears to be playing catch-up in the next generation interface space because of the early success of others in the phone market, then you might be surprised to learn that it is actually a pioneer. While Apple unveiled the iPhone in January 2007, Microsoft started on the Surface in 2001.

I got the chance to talk to Bryan Coon, lead software engineer at InterKnowlogy, about the Surface and Kinect development. InterKnowlogy is one of the companies leading the charge to bring these technologies to business applications, and Coon has worked on a number of very interesting projects.

He said the Surface works well with “attract attention-type apps” because they grab attention and support from multiple users all poking and prodding from all angles. In the computer world, Surface produces something akin to the street performer effect where people want to walk up and play with it. Surface makes you want to touch it and interact with it. I have been around Surface for a while now and admit that is still my first inclination when I see one.

The primary interactions with the Surface are based on touch, including flicks and drags, though keyboards are typically on-screen and appear as needed. Coon said the InterKnowlogy approach is to “envision groups working with it together.” Touch is demanding for a developer, but very satisfying to the user when done well.

The Windows 8 Metro interface is instructive in this regard in ways I had not considered prior to the Build 2011 conference last September. Microsoft has put a lot of thought into how touch changes the game beyond phones, and into the kinds of applications that work with touch and the kinds that do not. A key takeaway is that performance is critical in any touch-enabled system or application because the interface must react fluidly and instantaneously. This has driven many of the decisions that shaped the iPhone and subsequent iPad experience. The problem is that Apple was not at all transparent about why it did not allow multitasking, or about the other tradeoffs it made to ensure responsiveness.

With a touch interface, any freeze in the UI—even for a second—is user-confounding and a deal-breaker. This is why Windows 8 development makes such a priority of asynchronous tasks, to prevent the user interface from blocking.
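To make the non-blocking idea concrete, here is a minimal sketch in Python rather than in the Windows 8 APIs themselves. It assumes Python 3.9 or later for asyncio.to_thread, and the names slow_lookup, ui_loop and run_demo are made up for illustration. The point is only that slow work runs off the loop that draws the interface, so frames keep rendering while the work is in flight.

```python
import asyncio
import time


def slow_lookup(seconds: float) -> str:
    """Stand-in for blocking work (disk, network, heavy computation)."""
    time.sleep(seconds)
    return "done"


async def ui_loop(stop: asyncio.Event, frames: list) -> None:
    """Hypothetical render loop: records a timestamp for each 'frame' drawn."""
    while not stop.is_set():
        frames.append(time.monotonic())
        await asyncio.sleep(0.01)  # yield so other tasks can run


async def run_demo(work_seconds: float = 0.2) -> tuple:
    stop = asyncio.Event()
    frames: list = []
    ui_task = asyncio.create_task(ui_loop(stop, frames))
    # Run the blocking call on a worker thread so the event loop keeps ticking.
    result = await asyncio.to_thread(slow_lookup, work_seconds)
    stop.set()
    await ui_task
    return result, len(frames)


if __name__ == "__main__":
    result, frame_count = asyncio.run(run_demo())
    print(result, "with", frame_count, "frames drawn while waiting")
```

Calling slow_lookup directly inside the coroutine would stall ui_loop for the full duration, which is exactly the freeze a touch user notices.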
Building from the ground up
The development tools and the infrastructure are catching up quickly to help us build touch-enabled applications. For example, software development tool vendor DevExpress is committed to touch for all products, supporting even a touch-capable grid control.

One of the key challenges is to not force touch support into a project by shoehorning the functionality into existing interfaces, but to take advantage of the new capability to enrich the experience and make using the application easier. InterKnowlogy bundled some of its experience into a scatter control that enables multi-touch (provided of course the hardware supports it), and has made that control available. Knowing what tools are available is an important part of being in a position to use these new interfaces when the opportunity arises. A really great example of leveraging the strengths of the Surface is InterKnowlogy’s Warehouse Commander application, which takes data from Microsoft Dynamics and allows a shipping warehouse to optimize bin placement for greater efficiency, all with a very visual, touch-enabled application. A full demo of that program is available and gives a glimpse of a line-of-business application that uses the Surface quite well.

Touch interfaces are cool, but they have their limits. When you look beyond touch interfaces, things get a bit less defined. For example, swiping a finger across a display or interface-enabled surface is clearly touch, but what about putting an object on a tabletop device? This is a common element of the demos we see with future interfaces, and the Surface even has a tag system to support this kind of interaction. Then there are waves, pointing, and any number of other motions that are referred to as spatial gestures. In techie circles, we tend to call these “Minority Report”-style interfaces after the Steven Spielberg movie starring Tom Cruise that depicts characters using interfaces that are just one step short of “Star Trek’s” Holodeck.

The more correct term is NUI (Natural User Interface), and the Holodeck is the best example of where things could be heading, though probably not in my lifetime. The common theme when NUI is discussed is that the interface should be invisible, or non-intrusive. By that definition, the Kinect gets us most of the way there. The goggle screens with gloves from years past technically fit the NUI category too, but they fail on the non-intrusive point. Microsoft gets it, but as always, it needed competitors to force it down the road.

In fact, right around the time Microsoft was prototyping the first Surface device, Spielberg worked with Microsoft to help understand futuristic interfaces while making “Minority Report.” As previously mentioned, this film is widely referenced as the poster child for NUI interfaces, with depictions of the protagonist plucking virtual objects in a virtual reality interface (i.e. performing spatial gestures to control the system). Spatial gestures in the Microsoft world are the province of the Kinect, which will be discussed in detail later in the article.

Neil Roodyn, director of nsquared, talks about a system he has worked on where the user is immersed in a building and can manipulate fixtures, including swapping out doorknobs and moving lamps and other furniture with gestures. The system he displays supports touch through tablet integration, but it also goes beyond that to what he calls “vision systems.” By this, he means a system that understands the context of the environment as a person seeing it might: not only knowing that something is placed on a touch surface, but what was put there and by whom.

During a talk that Roodyn delivered, he used the Kinect as the provider of a rich vision system for accomplishing that environmental understanding thanks to the three cameras, an infrared array and a microphone array it provides. During that same presentation, he pointed out a pretty amazing video that Corning produced over a year ago titled “A Day Made of Glass.” It showed where the company imagines it could take its products in a way that leverages the touch interface. You can see it yourself.

The Surface is a different market than the other technologies discussed. It costs thousands of dollars, making it very expensive when compared to the Kinect or even a touch-enabled tablet, but it was never meant to be a mass-market device. Conversely, the Kinect was really never meant to be a general-purpose input device.

Kinect, a game-changer
The Kinect is a whole different animal, and is likely a major game-changer in much the same way the iPhone was for touch going mainstream. The entry level for the user on the Kinect is much lower, with a price point under US$200.

It was originally called Project Natal, and I first heard about it when someone pointed me at a video that showed it as the answer to the Nintendo Wii game system, one that raised the bar by taking the controller out of the picture. It is turning out to be a game-changer that is surprising everyone, including Microsoft. It one-upped the Wii’s motion-based interface by not requiring a controller of any kind.

Nintendo certainly deserves some of the credit for putting us on this road, as Microsoft does its best innovation in response to a competitive threat, and the Wii control and balance board set the stage. The Kinect was dubbed the “fastest-selling consumer electronics device” after selling 8 million units in the first 60 days.

It was not long before developers started to hack at the device to figure out how it worked and how it could be leveraged outside of the Xbox 360. In fact, there are all kinds of videos and guides from about a year ago showing how to rig up the USB-like plug from the Kinect so that it could be connected to a PC USB port and be supplied external power. This all seemed to catch Microsoft by surprise, and since it all actually violated the terms of use, there was an expectation that there might be a backlash from Microsoft.

Rather than clamp down on these early innovators, Microsoft adapted by developing a plan to create and release a Windows SDK to let developers put the Kinect to work for Windows applications. The Kinect for Windows SDK, along with resources such as tutorials, is available at www.kinectforwindows.org. The current set of bits supports Windows 7 and the Windows 8 Developer Preview, and is for non-commercial purposes with the commercial version promised in “early 2012.”

We will discuss what exactly the SDK provides in a bit. There is one small bit of hardware beyond the Kinect sensor bar itself that you will need if you want to embark on playing with the Kinect SDK, and it has to do with the USB-like interface of the Kinect sensor. You do not have to break out your soldering iron to try it out because Microsoft and third-party provider Nyko have made available an adapter that lets you plug the Kinect into older Xbox 360s that do not have the proper connector for the Kinect.

I suspect that the vast majority of people buying these today are using them to adapt the Kinect to a PC for programming purposes rather than for older Xboxes. The adapters also provide power via a plug since the Kinect plug powers the sensor as well. That is a great deal for $25 to $35 since you can avoid messing with trying to build your own.

The Kinect for Windows SDK still has Microsoft Research’s fingerprints all over it. That is by no means a bad thing since the most jaw-dropping, cool things from Microsoft typically have their start at Microsoft Research. The first thing you have to do when starting a project that leverages the Kinect is to add a reference to the Microsoft.Research.Kinect DLL. The SDK comes with samples that are a great help in getting started, including the Shape Game and the Skeletal Viewer.

The source code for these two samples will be invaluable to jump-starting your understanding of how to make use of the capabilities provided by the SDK. The latest version of the Kinect for Windows SDK provides access to raw data streams from the depth sensor, color camera and the microphone array that consists of four microphones. This is the source of the ocean of data alluded to in my conversation with InterKnowlogy’s Coon.

One of the key functions is skeletal tracking, since whole-body tracking is a great way to provide control, and both of the sample programs make good use of it. Many of the improvements to the latest version are to the skeletal tracking system, including a speed and accuracy boost. There is also now support for losing connectivity with the Kinect without losing everything (a problem with the past version).
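To give a feel for what skeletal data lets an application do, here is a small Python sketch. It does not use the Kinect SDK. The Joint class, the joint names and the coordinate convention are all assumptions made for illustration, and a real SDK will expose its own types. The idea is simply that once you have named joints with positions, a pose check is a few comparisons.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Joint:
    x: float  # metres, positive to the sensor's right (assumed convention)
    y: float  # metres, positive up (assumed convention)
    z: float  # metres, distance from the sensor
    tracked: bool = True  # False when the position is only a guess


# Hypothetical layout: joint name -> position for one tracked person.
Skeleton = Dict[str, Joint]


def hands_above_head(skeleton: Skeleton, margin: float = 0.05) -> bool:
    """True when both hands are tracked and at least `margin` metres above the head."""
    needed = ("head", "hand_left", "hand_right")
    if any(name not in skeleton or not skeleton[name].tracked for name in needed):
        return False
    head_y = skeleton["head"].y
    return all(skeleton[h].y > head_y + margin for h in ("hand_left", "hand_right"))


if __name__ == "__main__":
    pose = {
        "head": Joint(0.0, 0.6, 2.0),
        "hand_left": Joint(-0.3, 0.8, 2.0),
        "hand_right": Joint(0.3, 0.9, 2.0),
    }
    print(hands_above_head(pose))
```

Ignoring joints that are not confidently tracked is a deliberate choice here: acting on a guessed position is a quick way to trigger commands the user never intended.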
Design moves at start
The ultimate question, as nsquared’s Roodyn put it, is “whether this is evolutionary or revolutionary.” I agree with his position that it is in fact revolutionary, which means that you cannot just forklift these interfaces into your old applications. That is the true hazard, since classic interfaces will be viable for the foreseeable future, but botching the use of NUI interfaces will cause much more frustration for all involved.

When developing a system, the time to include these interface mechanisms in the application is at the very start of the design phase. Every week there is a new video available that shows how people are leveraging Kinect, and they seem to fall into two categories: those that just use voice or gesture to do what a mouse and keyboard could do, and those that accelerate the user experience by making use of the extra dimensions.

As a user, I know that I want the system to intuit as much as possible, but no more. I want it to demand natural gestures rather than make me learn an abstract motion vocabulary. For example, if I am determining whether I want the system to continue to feed me data or stop, I think the gestures used in the card game blackjack for “hit” or “stay” would be easy to grok.

A less thoughtful design might make the user raise his or her hand for more data and lower it to stop. With that scheme, there is no basis for remembering the commands, and if I need a lot of data, I am likely to get tired holding my hand up for a while. Either way, this is the kind of thoughtfulness developers and designers need to apply: its absence is annoying in a GUI application, but disastrous with these next-generation interfaces.
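As a rough illustration of the blackjack idea, the Python sketch below classifies a short hand path as a "stay" (a sideways wave) or a "hit" (a downward tap and return). Everything in it is hypothetical: the (x, y) point format, the function name and the 0.15 metre threshold are assumptions, and a real recognizer would also need timing and smoothing.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) hand position in metres; hypothetical format


def classify_blackjack_gesture(path: List[Point], min_travel: float = 0.15) -> Optional[str]:
    """Return "stay" for a mostly sideways wave, "hit" for a mostly vertical tap, else None."""
    if len(path) < 2:
        return None
    xs = [p[0] for p in path]
    ys = [p[1] for p in path]
    x_travel = max(xs) - min(xs)
    y_travel = max(ys) - min(ys)
    if max(x_travel, y_travel) < min_travel:
        return None  # movement too small to be deliberate
    return "stay" if x_travel > y_travel else "hit"


if __name__ == "__main__":
    wave = [(0.0, 1.0), (0.2, 1.01), (0.0, 1.0), (0.2, 0.99)]
    tap = [(0.1, 1.0), (0.1, 0.8), (0.11, 1.0)]
    print(classify_blackjack_gesture(wave), classify_blackjack_gesture(tap))
```

Both gestures are brief and return the hand to rest, which avoids the tired-arm problem of a hold-your-hand-up command.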

If we look at how the Kinect is being used we see it in some very innovative applications. For the medical field in general, the Kinect is huge. Some implementations are based on the Xbox 360, such as at the Royal Berkshire Hospital in England, which is using off-the-shelf, Kinect-enabled games to work on patient balance, coordination and movement in general.

For a more custom implementation that works on rehabilitation, InterKnowlogy uses the PC with the Kinect, and bases its code on the Windows Kinect SDK, which is the most interesting way to go, in my opinion. Coon mentioned an application he referred to affectionately as the “rehab-o-matic application,” which helps people recover from surgery.

After surgery, there are a number of checkups to see how the patient is progressing. The idea is to use the Kinect to let the patient rehab at home and measure the progress, which is easier on the patient and saves money. The system can even report back to the doctor on how the patient is doing. InterKnowlogy has posted a video that shows how it works and discusses some of the development challenges; it is worth checking out.
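The article does not say what the InterKnowlogy application actually measures. One plausible measure of progress, offered here purely as an assumption, is range of motion: the angle at a joint such as the elbow, computed from three tracked points. A minimal Python sketch of that arithmetic follows; the function name and the tuple format are made up for illustration.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) joint position; hypothetical format


def joint_angle_degrees(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Angle at joint b, in degrees, between the segments b->a and b->c."""
    ba = tuple(p - q for p, q in zip(a, b))
    bc = tuple(p - q for p, q in zip(c, b))
    norm_ba = math.sqrt(sum(v * v for v in ba))
    norm_bc = math.sqrt(sum(v * v for v in bc))
    if norm_ba == 0 or norm_bc == 0:
        raise ValueError("joints must not coincide")
    cos_theta = sum(u * v for u, v in zip(ba, bc)) / (norm_ba * norm_bc)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print(round(joint_angle_degrees(shoulder, elbow, wrist), 1))
```

Logging the largest and smallest angle reached in each session would give a doctor a simple trend to review, though any clinical use would need far more care than this sketch shows.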

One of the coolest implementations for doctors to use while in surgery is by Tedesys in Spain, which allows them to manipulate scans without having to touch anything that is not sterile (or anything at all for that matter). I think it is safe to say that the medical field will likely be a leader in leveraging the Kinect in the next couple of years.

Think before you move
There are so many examples of applying the Kinect. I asked Coon if he found that the Kinect is only for projects that cannot be done some other way. He said, “Certain things it does very well, and some others not so well. A common mistake is to try to drive an old-style app with Kinect without rethinking it. The Kinect is not just a new mouse.”

Xbox 360 games have taken many different strategies to adapt to Kinect. For example, Dance Central has done it the right way, according to Coon, because it has limited the controls on screen to make it more of a wizard style. This limits distractions for users and lets them navigate obvious paths. Immediate and obvious feedback is critical, since it is a movement-based interface. One way to think about it is that it is like the old text-based games, where the user has a few choices to make and based on that choice is offered two to four new choices.
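The text-adventure comparison maps naturally onto a small state machine. The Python sketch below is not how Dance Central is built. The screen names and choices are invented to show the shape of a wizard-style flow: each screen offers only a handful of options, and anything not offered is ignored rather than misread.

```python
from typing import Dict, List

# Hypothetical screen graph: each screen maps a few named choices to the next screen.
SCREENS: Dict[str, Dict[str, str]] = {
    "start": {"play": "pick_song", "settings": "settings"},
    "pick_song": {"easy": "dance", "hard": "dance", "back": "start"},
    "settings": {"back": "start"},
    "dance": {},
}


def choices(screen: str) -> List[str]:
    """The short list of options to show on this screen."""
    return sorted(SCREENS[screen])


def advance(screen: str, choice: str) -> str:
    """Move to the next screen, or stay put if the choice is not offered here."""
    return SCREENS[screen].get(choice, screen)


if __name__ == "__main__":
    screen = "start"
    for gesture in ("hard", "play", "easy"):  # "hard" is ignored on the start screen
        screen = advance(screen, gesture)
    print(screen, choices("pick_song"))
```

Keeping the option count per screen small means each gesture can be large and distinct, which helps with the immediate, obvious feedback Coon describes.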

There is no expectation that you will drop what you are doing and start your current project over again with these new technologies. However, it is a good time to pay attention and look at where they might fit in your arsenal of tools. Keeping up with new development tools is par for the course, and you should treat these next-generation interfaces in the same way.

Opportunity favors the prepared mind, and if you are anything like me, you will need a good amount of time playing with these things, especially the Kinect, before you can leverage them in the right way for a real project.

Article Tags

Kinect, Microsoft

About Patrick Hynds

Patrick Hynds is a Regional Director for Microsoft and president of CriticalSites.

