Video, video everywhere, and not enough people to watch it. That’s the conundrum facing military and security personnel today, the people who sit in front of banks of monitors, watching hours of mind-numbingly mundane footage of people going about their business, yet must be attuned to any slight clues to a wanted suspect or potential crime.
But now, researchers at MIT and the University of Minnesota have created a new program to discern such signals from the video noise faster and more accurately than a human or existing automated system.
It’s a new type of smart surveillance system that “learns” from previously recorded video footage how to quickly scan realtime feeds and identify specific suspects. It can also flag unusual, potentially dangerous changes in an environment like an airport, such as when someone deliberately leaves behind a bag.
“The learning phase is very fast, not requiring more than a minute for the problems we explored,” wrote Christopher Amato, the leader of the effort and a postdoctoral candidate with MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).
The new system developed by Amato and his colleagues is a software program called “Biologically Inspired Scene Estimation” (BIS-E). It builds upon current surveillance systems that require users to choose among various vision algorithms to apply to a given screen or particular scenario.
But BIS-E takes this idea a step further, relying instead on higher-level algorithms that can quickly analyze a realtime video feed, compare it to the filters that were applied to previous similar footage, and select the best option of the available set, all without human direction.
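The researchers' actual selection method isn't spelled out here, but the general idea of learning from past footage which algorithm to apply can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the scene labels, the algorithm names and the accuracy figures are hypothetical and do not come from BIS-E.

```python
from collections import defaultdict


def learn_scores(history):
    """Average the observed accuracy of each algorithm per scene type.

    history: iterable of (scene_type, algorithm_name, accuracy) tuples,
    standing in for results on previously recorded, labeled footage.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (scene, algorithm) -> [sum, count]
    for scene_type, algorithm, accuracy in history:
        entry = totals[(scene_type, algorithm)]
        entry[0] += accuracy
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def select_algorithm(scores, scene_type, algorithms):
    """Pick the algorithm with the best learned score for this scene type.

    Algorithms never seen on this scene type score 0.0, so ties fall back
    to the first algorithm in the list.
    """
    return max(algorithms, key=lambda name: scores.get((scene_type, name), 0.0))


# Hypothetical results from earlier footage (not real BIS-E data).
history = [
    ("lobby", "face_match", 0.75),
    ("lobby", "face_match", 0.25),
    ("lobby", "motion_diff", 0.75),
    ("corridor", "face_match", 1.0),
    ("corridor", "motion_diff", 0.5),
]
ALGORITHMS = ["face_match", "motion_diff"]

scores = learn_scores(history)
best_for_lobby = select_algorithm(scores, "lobby", ALGORITHMS)
best_for_corridor = select_algorithm(scores, "corridor", ALGORITHMS)
```

In this toy setup the learning phase is just an averaging pass over past results, and selection at run time is a single lookup, which is one plausible reason such a scheme could run without a human choosing the filter.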
In practical terms, this means the software can automatically identify and follow specific suspects based on facial recognition technology, or sound an alarm if a scene changes in a way that causes concern, such as bags being left behind deliberately in an airport lobby or people or objects moving in ways deemed “unusual” by the system.
In fact, trials of the BIS-E system found it to be about four seconds faster at identifying people in video footage than a human expert (18 seconds vs. 22 seconds), and about two seconds faster at identifying objects (2 seconds vs. 4 seconds). It was also about 16 percentage points more accurate at both tasks (53 percent vs. 37 percent).
“Currently, the system can identify people with some amount of disguise, but relies on a relatively simple version of facial recognition to ensure the correct person is identified,” Amato told TPM. “More sophisticated methods can be added to the system to improve its performance in these situations.”
Not surprisingly, development of the BIS-E system was funded by a Defense contract and a contractor, Aptima, where Amato previously worked.
Amato told TPM that the system has so far been tested only in the laboratory, using two stationary cameras in a single room, but that it was able to distinguish between two previously identified individuals and pinpoint one previously identified object, even when that object was “hidden in various ways and different lighting conditions.” Amato also said it could support 10 cameras and eventually many more, with further development.
Even more intriguingly, Amato and his collaborators believe that their system could be applied to any other type of remote sensing project, even for sensors that don’t use video cameras whatsoever.
“The idea is that our approach could be used with other types of sensors (visual, aquatic or otherwise),” Amato explained. “If we can estimate the sensor uncertainty, we can determine a high level controller for determining which sensors to utilize and how to analyze the information that is gathered by those sensors.”
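To make Amato's point about sensor uncertainty concrete, here is a small Python sketch of one standard way to act on such estimates: prefer the least noisy sensors, then combine their readings with inverse-variance weighting. This is a swapped-in textbook technique, not the controller the researchers built, and the sensor names, variances and readings are hypothetical.

```python
def pick_sensors(variances, budget):
    """Return the `budget` sensors with the lowest estimated noise variance."""
    return sorted(variances, key=variances.get)[:budget]


def fuse(readings, variances):
    """Combine readings with inverse-variance weights.

    Less noisy sensors (smaller variance) get proportionally more weight.
    """
    weights = {name: 1.0 / variances[name] for name in readings}
    total = sum(weights.values())
    return sum(weights[name] * readings[name] for name in readings) / total


# Hypothetical uncertainty estimates for three sensors.
variances = {"camera_a": 4.0, "camera_b": 1.0, "sonar": 2.0}

# Use only the two most reliable sensors.
chosen = pick_sensors(variances, budget=2)

# Hypothetical readings of the same quantity from the chosen sensors.
readings = {"camera_b": 10.0, "sonar": 13.0}
estimate = fuse(readings, variances)
```

The fused estimate lands closer to the reading from the sensor with the smaller variance, which is the behavior one would want from a controller that trusts reliable sensors more.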
So, BIS-E could someday even be used to monitor weather conditions, such as the formation of tornadoes, according to MIT. Amato declined to answer questions about whether he or his colleagues were concerned about potential misuse of their system.
Amato worked with three other researchers at the University of Minnesota to develop the first working prototype of the system, a process that took six months. The researchers will present a paper on their work at an artificial intelligence conference in Toronto in July.
Originally posted on Talking Points Memo.