No one could have predicted the topics of discussion at this year’s Predictive Analytics World. The conference, held in San Francisco March 14 and 15, focused on the business applications of analytics, and the possibility of predicting the future with data models and information. But despite these lofty goals, the current state of the analytic art appears to be closer (in size) to Delaware than Texas.
Rather than dwelling on machine learning and model design, much of the PAW conference centered on immediate practices that can bring analytics to bear on business problems. Thanks to modern analytics solutions from companies like Fuzzy Logix, Google and SAS, developers and business analysts can now concentrate on extracting meaning from data, rather than just on getting that data into shape.
Toward this end, PAW talks focused heavily on in-database analytics. Partha Sen, president and CEO of Fuzzy Logix, said that this is the new way of doing things. Previously, analytics packages required data to be pulled out of a database, manipulated, analyzed, and then merged back into the database.
That way of doing analytics is over, said Sen. “What is happening is there is an explosion of data. Ninety days of weblogs is 2 or 3TB of data. Simultaneously, people were using technology that was old, and that was not really built for this,” he said.
“Obviously there is flux, and the flux is caused by the volume of data. Simultaneously, people want to do more, more and more with that data. At the end of the day, it’s just the right time for people to think about it.”
Bill Franks, chief analytics officer for Teradata, spoke at the conference about this very topic. “In a few years, if you talk to an upcoming analyst, you’ll say, ‘In the old days, we used to make copies of the data and run analysis on it,’ and you’ll laugh about how silly that was.”
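The contrast Sen and Franks describe, copying data out for analysis versus computing where the data lives, can be sketched in a few lines. What follows is a minimal illustration, not any vendor's product: it uses Python's built-in sqlite3 module as a stand-in database, and the weblog table, its columns and its values are hypothetical.

```python
import sqlite3
import statistics

# Hypothetical in-memory table standing in for a large weblog store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weblog (page TEXT, load_ms REAL)")
conn.executemany(
    "INSERT INTO weblog VALUES (?, ?)",
    [("/home", 120.0), ("/home", 80.0), ("/atm", 300.0), ("/atm", 500.0)],
)

# Older pattern: copy every row out of the database, then analyze it in the client.
rows = conn.execute("SELECT page, load_ms FROM weblog").fetchall()
by_page = {}
for page, load_ms in rows:
    by_page.setdefault(page, []).append(load_ms)
extracted = {page: statistics.mean(values) for page, values in by_page.items()}

# In-database pattern: the engine computes the aggregate and only the summary leaves.
in_database = dict(
    conn.execute("SELECT page, AVG(load_ms) FROM weblog GROUP BY page").fetchall()
)

print(extracted == in_database)
```

Both approaches give the same averages here. The difference is in what moves: the first ships every row to the client, while the second returns one row per page, which matters when 90 days of logs run to terabytes.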
Analytics life cycle
But running analytics inside of databases is not the only shift taking place in the world of data. Previously niche areas of analysis have recently become hot topics, thanks to social media sites like Facebook and Twitter. Text analysis, for example, has been around for years, but has only recently become widely appealing, as corporations pore over thousands of tweets and blog comments that mention their company and product names.
Steve Ramirez, CEO of banking customer relations consulting firm Beyond the Arc, spoke about some of the problems encountered by businesses hoping to implement policy changes based on social media commentary. He said that just getting the data and doing the analytics is the easy part. The tougher challenge is making sure the enterprise actually acts on that data.
“First, really look at the ability to set up a shared analysis workspace. Many of the tools are oriented toward the mindset of a single analyst working on a single problem from beginning to end,” said Ramirez.
“This first piece, creating the tools, understanding how we’ll approach analysis, is a key element.”
Once the tools are set up and shared, a workflow through the business must be created so that this information is actually used by the people who can make decisions. Discovering that customers hate the speed of their ATM transactions does nothing to help a bank if the VP in charge of ATMs can’t take immediate action based on that information, said Ramirez.
“First we tried to figure out how to approach the problem by getting the quickest, fastest, most significant impact,” he said, discussing the case of a bank client. “We found the first thing we could do is identify products being discussed, right off the top.
“Second, we were able to pick off the really detailed issue. We could tell there was an ATM out of envelopes faster than we could tell if there was a deposit issue with the machine. We found in our model it was easier to find the detail and work our way back up. We then went to refine them, and then we would apply some additional layers of statistical analysis. Finally, we combined all these different inputs into the model of models.”
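The first two steps Ramirez describes, spotting the product under discussion and then the detailed issue, can be sketched as simple keyword tagging. This is only an assumed illustration, not Beyond the Arc's method: the term lists and the tag_comment function are hypothetical, and the later refinement, statistical layers and combined model are not shown.

```python
import re

# Hypothetical term lists; a real project would build these from its own data.
PRODUCT_TERMS = {"atm": "ATM", "mortgage": "mortgage", "checking": "checking account"}
ISSUE_TERMS = {"out of envelopes": "ATM out of envelopes", "slow": "slow transaction"}


def find_labels(text, terms):
    """Return the sorted labels whose term appears as a whole word or phrase."""
    lowered = text.lower()
    return sorted(
        {
            label
            for term, label in terms.items()
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)
        }
    )


def tag_comment(text):
    """Tag a customer comment with the products and detailed issues it mentions."""
    return {
        "products": find_labels(text, PRODUCT_TERMS),
        "issues": find_labels(text, ISSUE_TERMS),
    }


example = tag_comment("The ATM on Main St is out of envelopes again")
print(example)
```

A narrow phrase such as "out of envelopes" is easy to match exactly, which fits Ramirez's observation that the detailed issue was easier to find than the broader one.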
Those workflows aren’t just limited to the final data. Creating models for analysis is closer to software development than most other business efforts, and thus, an analytics life cycle is required to keep data models up to date and to evaluate their effectiveness over time.
Tapan Patel, product marketing manager at SAS, used to be a product manager, so he is familiar with application life-cycle management. He said that analytics life-cycle management is quite similar.
“As part of a modeling life cycle, there has been a lot of collaboration between analytics business users and IT. When the model is already in production, you also need the businessperson to tell us which new variables I need to look into. Where is your business going? What are the new areas you want to target? How mature is my market? Those kind of things affect the life cycle,” he said.
“Some of our customers are saying, ‘Look, we have extracted and fine-tuned our models the best we can. What variables should I look into outside of my traditional realm which will affect my outcomes?’ ” said Patel.
Voluntary participation and contribution
Those outside variables are what make the world of predictive analytics so interesting at present. Andreas Weigend, formerly chief of analytics at Amazon and now a freelance consultant, said that social data, not social media, has fundamentally changed the game for all businesses.
“Why do people tweet? Because they think people are following them,” he said. “Facebook has even more of a feedback loop, but they want you to think you have an audience. Why do people share?
“I think it’s our way to express identity. It’s no longer the Coach bag. It’s the photos we take. It’s a way of expressing ourselves. Why do we do this? Because we hunger for attention. This is not about social media, this is not about the latest Twitter strategy for your company. It’s about social data.”
To demonstrate this, Weigend discussed successful social data programs, such as Nike’s efforts to allow runners to track their exercise regimes and share the geolocation data associated with their workouts. He also cited the work of 23andMe, a DNA sequencing company that is allowing users to share their own genetic makeup with the world.
This, said Weigend, is data that scientists would take years to aggregate, and yet, here people are sharing it with the world, for free and in the thousands. And this, he said, is why analytics is such a hot issue.