Want photo-tagging sites like Flickr to get better at recognizing cats and dogs? MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is working on it.
In December, CSAIL researchers will present a new way of using machine learning at the annual Conference on Neural Information Processing Systems. Built on a new object-recognition algorithm, this process will enable semantically related concepts to reinforce each other by examining the co-occurrence of tags on Flickr.
MIT News’ Larry Hardesty wrote that the CSAIL researchers’ object-recognition algorithm learns to weight the co-occurrence of tags. It would give more weight to “dog” and “Chihuahua,” which are semantically similar, than to “dog” and “cat,” which co-occur less often.
The conventional way to train a machine-learning model is to use only the data associated with a particular category. By instead using semantic similarities across a set of data, the CSAIL researchers found that their algorithm predicted users’ Flickr tags better than a conventional training strategy did, and that it allowed more categories to be predicted, Hardesty wrote.
The algorithm can go through Flickr images and determine tags that co-occur. For instance, the algorithm would recognize a picture of a landmark tagged with “travel,” “building,” and “museum,” and use the co-occurrence of tags “museum” and “building” to identify semantic similarity.
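To make the idea of tag co-occurrence concrete, here is a minimal Python sketch that counts how often pairs of tags appear on the same photo. It is an illustration only: the article does not describe how the CSAIL researchers compute co-occurrence, and the `photos` list and the `count_cooccurrences` function are hypothetical names with made-up data.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(photos):
    """Count how many photos each unordered pair of tags appears on together."""
    counts = Counter()
    for tags in photos:
        # Sort and de-duplicate so ("dog", "cat") and ("cat", "dog") share one key.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# Made-up example data: each inner list holds the tags on one photo.
photos = [
    ["travel", "building", "museum"],
    ["building", "museum"],
    ["dog", "chihuahua"],
    ["dog", "cat"],
    ["dog", "chihuahua", "travel"],
]

counts = count_cooccurrences(photos)

# List the pairs from most to least frequent.
for pair, n in counts.most_common():
    print(pair, n)
```

In this toy data, “building” and “museum” share more photos than “dog” and “cat” do, which is the kind of signal a system could treat as evidence of semantic similarity.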
The CSAIL researchers will test their algorithm to see if it can identify visual features that relate to particular Flickr tags. It would be penalized for any tag it failed to predict.
The difference between MIT’s system and a conventional machine-learning system is that the new system gives the algorithm partial credit for incorrect tags that are semantically similar to the correct tags.
The problem with giving partial credit for incorrect tags is that it involves complex calculations that go beyond simple true-or-false predictions. That is why the researchers are using a metric called the Wasserstein distance, a function that compares probability distributions.
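The partial-credit idea can be sketched in a few lines of Python for one simple special case. When the correct answer is a single tag, the Wasserstein distance between the predicted distribution and the correct one reduces to the expected distance from each predicted tag to the correct tag, because all of the predicted probability has to be moved to that one tag. The `GROUND_DISTANCE` values below are invented for illustration, and the function names are hypothetical; this is not the researchers’ implementation, which handles the general case.

```python
# Hypothetical distances between tags: 0 means identical, 1 means unrelated.
# Keys are alphabetically sorted pairs so each pair is stored once.
GROUND_DISTANCE = {
    ("chihuahua", "dog"): 0.2,
    ("cat", "dog"): 0.6,
    ("building", "dog"): 1.0,
}


def distance(a, b):
    """Look up the assumed semantic distance between two tags."""
    if a == b:
        return 0.0
    return GROUND_DISTANCE[tuple(sorted((a, b)))]


def wasserstein_to_single_tag(predicted, true_tag):
    """Wasserstein distance from a predicted distribution to one correct tag.

    With a single correct tag, every unit of predicted probability must move
    to that tag, so the distance is the probability-weighted ground distance.
    """
    return sum(p * distance(tag, true_tag) for tag, p in predicted.items())


def zero_one_loss(predicted, true_tag):
    """Conventional all-or-nothing scoring of the most likely tag."""
    top_tag = max(predicted, key=predicted.get)
    return 0.0 if top_tag == true_tag else 1.0


# Two wrong predictions for a photo whose correct tag is "dog".
near_miss = {"chihuahua": 0.7, "dog": 0.3}
far_miss = {"building": 0.7, "dog": 0.3}

for name, prediction in [("near miss", near_miss), ("far miss", far_miss)]:
    print(name,
          zero_one_loss(prediction, "dog"),
          wasserstein_to_single_tag(prediction, "dog"))
```

Both predictions are equally wrong under all-or-nothing scoring, but the Wasserstein-style score penalizes the “Chihuahua” guess far less than the “building” guess. With several correct tags, the calculation becomes an optimization over how to move probability between tags, which is where the complexity mentioned above comes in.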
A system based on similarity between words might be more logical than one that searches online for exact keywords. In the future, CSAIL’s researchers hope to test their system using ontologies that are standard in machine-vision research, which could help improve object recognition.