Computers keep getting smarter, and the range of things they can detect about humans keeps growing. A new algorithm developed by computer scientists Hamed Pirsiavash (a postdoctoral scholar at MIT) and Deva Ramanan (a computer science associate professor at the University of California at Irvine) can decode human behavior from video.
While action-recognition algorithms have already been developed, Pirsiavash and Ramanan said their algorithm provides several advantages over others.
The new algorithm has faster execution times, makes better guesses about partially completed actions, and can handle video streams of any length, they said.
“We focus on the task of action classification and segmentation in continuous, real-world video streams,” the two wrote in a paper, “Parsing video of actions with segmental grammars.” “Much past work on action recognition focuses on classification of pre-segmented clips. However, this ignores the complexity of processing video streams with an unknown number of action instances, each with variable durations and start/end times.”
(Related: Another MIT project can track people through walls)
This new approach to video analysis uses techniques from natural language processing to allow computers to efficiently scan video for actions.
“One of the challenging problems they try to solve is, if you have a sentence, you want to basically parse the sentence, saying what is the subject, what is the verb, what is the adverb,” Pirsiavash said in an MIT News article. “We see an analogy here, which is, if you have a complex action—like making tea or making coffee—that has some subactions, we can basically stitch together these subactions and look at each one as something like verb, adjective, and adverb.”
The relationship between subactions is like rules of grammar, according to the computer scientists, and for any given action the algorithm must learn a new grammar. To do that, it relies on machine learning.
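The grammar-plus-segmentation idea above can be illustrated with a toy sketch. This is not the authors' implementation: the subaction names, scores, and the simple transition table are all hypothetical, and the dynamic program below only shows the general flavor of parsing a video stream into grammar-consistent subaction segments.

```python
# Hypothetical "grammar" for a tea-making action: which subaction
# may follow which. (Illustrative only; not from the paper.)
ALLOWED_NEXT = {
    "boil_water": {"pour_water"},
    "pour_water": {"steep"},
    "steep": set(),
}
START = "boil_water"

def best_segmentation(frame_scores):
    """frame_scores: list of dicts mapping subaction -> per-frame score.

    Returns (best_total_score, label_per_frame) for the highest-scoring
    labeling that respects the grammar's allowed transitions, found by
    dynamic programming over frames.
    """
    n = len(frame_scores)
    # dp[t][label] = (best score for frames 0..t ending in label, backpointer)
    dp = [dict() for _ in range(n)]
    dp[0] = {START: (frame_scores[0].get(START, float("-inf")), None)}
    for t in range(1, n):
        for prev, (score, _) in dp[t - 1].items():
            # Either stay in the same subaction or take an allowed transition.
            for label in {prev} | ALLOWED_NEXT[prev]:
                s = score + frame_scores[t].get(label, float("-inf"))
                if label not in dp[t] or s > dp[t][label][0]:
                    dp[t][label] = (s, prev)
    # Backtrack from the best final label to recover the segmentation.
    end = max(dp[-1], key=lambda lab: dp[-1][lab][0])
    labels = [end]
    for t in range(n - 1, 0, -1):
        end = dp[t][end][1]
        labels.append(end)
    return dp[-1][labels[0]][0], list(reversed(labels))

# Five frames of made-up per-frame classifier scores.
frames = [
    {"boil_water": 2.0, "pour_water": 0.1},
    {"boil_water": 1.5, "pour_water": 0.2},
    {"pour_water": 2.0, "boil_water": 0.1},
    {"steep": 1.8, "pour_water": 0.3},
    {"steep": 2.0},
]
score, labels = best_segmentation(frames)
# labels: ['boil_water', 'boil_water', 'pour_water', 'steep', 'steep']
```

Because the dynamic program walks the stream one frame at a time, a parser built this way can, in principle, run over video of arbitrary length rather than requiring pre-segmented clips, which is the advantage the researchers emphasize.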
Pirsiavash is mainly interested in medical applications of the algorithm. It could potentially detect improper execution of physical-therapy exercises, or detect whether an elderly patient has forgotten to take his or her medicine.
The new activity-recognition algorithm will be presented at the Conference on Computer Vision and Pattern Recognition in June.