Developers working on data mining projects should build Iron Man, not R2D2. That’s according to speakers at the second annual Predictive Analytics World Conference, which ended yesterday in San Francisco. The conference focused on building better tools for analysts rather than building autonomous robots that make decisions.
John F. Elder IV, founder and CEO of Elder Research, said the U.S. government has been an early adopter of text-mining technology. “It turns out that government agencies are taking the lead in text mining. Their data is huge, their need is great and they are willing to act on it,” he said.
Text mining is the act of extracting information from streams of text.
“Text mining is a lot like the Wild West right now, like data mining was a few years ago,” Elder said, explaining that the market for text mining was still in a fledgling state. Elder Research has contracted with the U.S. government to implement data mining solutions.
Elder said one of the key problems with text mining is that many projects attempt to automate the entire process. He said that is the wrong approach.
“Anyone who works with computers and humans knows their strengths are complementary; they are not alike,” he said. That means developers shouldn’t build analytics robots, but rather exoskeletal systems, metaphorically similar to the superhero Iron Man, that can enhance the comprehension and usefulness of the statisticians who use those systems.
Using predictive analytics, Elder implemented a system for the Social Security Administration that sped up the process of approving applications for disability insurance. For 20% of those applying for disability insurance, the process is fairly straightforward, but due to the complications introduced by the other 80% of applicants, everyone had to wait.
Elder’s system allowed the 20% to be instantly approved if their applications were similar to others that were worthy of the express lane. The system, he said, was 90% to 95% accurate in predicting which applications could be quickly approved, a result achieved by analyzing the words used in the application.
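The article does not say how Elder’s system compared applications, so the following is only a minimal sketch of the general idea, assuming a simple bag-of-words cosine similarity. The function names, the 0.6 threshold and the sample texts are all hypothetical, chosen for illustration. In keeping with the Iron Man approach, anything that does not clearly resemble a previously fast-tracked application is sent to a human reviewer rather than decided automatically.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def route_application(text, fast_tracked_examples, threshold=0.6):
    """Return ("express", score) if the text closely resembles any
    previously fast-tracked example, else ("manual_review", score).

    The threshold is an illustrative assumption, not a value from
    the system described in the article.
    """
    vec = Counter(tokenize(text))
    best = max(
        (cosine_similarity(vec, Counter(tokenize(e)))
         for e in fast_tracked_examples),
        default=0.0,
    )
    if best >= threshold:
        return "express", best
    return "manual_review", best


# Made-up sample data, for illustration only.
examples = ["applicant has total vision loss in both eyes"]
queue, score = route_application(
    "total vision loss in both eyes", examples
)
print(queue, round(score, 2))
```

A production system would more likely use a trained classifier and weighted terms, but the routing decision, express lane versus human review, would take the same shape.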