DARPA’s VIRAT: Video Search, With a Twist
Updates re: Kitware’s win. (Sept 1/10)
The proliferation of UAVs and fighters equipped with stabilized, high-magnification video pods and imaging radars has a number of corollary consequences. Bandwidth has become a key battlefield constraint. Specialized reconnaissance fighter aircraft are a dead concept. And some poor analyst has to sift through the video tsunami at the other end, in order to find items of interest.
The USA is using a number of approaches to help deal with the flood, and one unconventional approach involves a DARPA project called VIRAT (Video and Image Retrieval and Analysis Tool). It doesn’t recognize faces, perform before/after analysis, or rely on rewinds. Instead, it aims to distinguish certain types of behaviors, so it can provide alerts to intelligence operatives or ground forces during live operations.
DARPA has this to say about what VIRAT is, and isn’t:
“First, VIRAT seeks to enable military analysts to establish alerts that continuously query a real-time video stream. Generated alerts will cue analysts to dangers or opportunities in real-time and will provide actionable information even as events are unfolding on the ground during a combat mission. Second, VIRAT is developing tools that will put an entire petabyte-scale video archive at the fingertips of a military analyst, allowing them to very rapidly retrieve, with high precision and recall, video content that was gathered previously… The fundamental enabling technology being pursued under VIRAT is the development of representations of actions and activities that are robust…”
That last sentence is key. VIRAT is about recognizing and reporting actions: someone has gone into a building, someone is shooting, a car is accelerating, a group meeting is under way, etc. That goes beyond simple video rewind capability, all the way to providing play-by-play descriptions. Chick Hearn, call your office.
Now, here’s what VIRAT is not:
“…The primary focus of VIRAT is activity-based and dynamic information… The VIRAT program will not support the development of new algorithms for tracking, moving target detection and indication, image-based change detection, geo-registration, motion pattern learning, anomaly detection, and sensor fusion. While it is expected that such algorithms may be useful to VIRAT, the system will use existing capabilities in these areas… Face recognition, gait recognition, human identification, or any form of biometrics will not be funded or used in any way within this program.”

Advance warning from DARPA that this won’t be easy:
“The focus of VIRAT is down-linked aerial video, which should be carefully taken into account by proposed approaches. Spatial resolution is, at most, 10cm ground sample distance and more typically 20-30cm. The sensor is moving rapidly and is distant from the scene. Video quality can vary considerably due to sun angle, haze, rain and other environmental conditions. Sensor gimbal motion, sensor field of view changes, and sensor jitter will influence the presence and appearance of objects within each image. Obscuration and occlusion will vary with ground activity, changes in viewing perspective, and site-specific obstructions. Operational video sources may utilize visible or infrared wavelengths, with infrared display options including white-hot and black-hot settings.”
The VIRAT program will be conducted in 3 phases.
Phase 1 – Prototype Algorithm Development and System Design. They’ll use government-provided Day TV and IR data from a controlled collection made by “Predator-type sensors.” The Phase 1 system must have a coherent system architecture, and work with existing military systems used by the program’s “transition partners.” The goal is 85% identification of target events or activities from the controlled archives, with no more than 8 false positives per video stream hour. When using real-time streaming, the goal is 75% identification from the streaming video, with no more than 12 false positives per video stream hour. The system should be able to change to a different stored query within 1 second, and set up a new query for viewing in 10 minutes.
BAE Systems, Kitware, and Lockheed Martin each received a Phase 1 contract in September or October 2008. The contracts ran to March 2010.
Phase 2 – Algorithm Optimization and System Integration. Basically, improve performance and identify a wider variety of event types. The goal is now 90% identification of target events or activities from the controlled archives, with no more than 4 false positives per video stream hour. When using real-time streaming, the goal is 85% identification from the streaming video, with no more than 6 false positives per video stream hour. The system should be able to change to a different stored query within 1 second, and set up a new query for viewing in 5 minutes.
In August 2010, Kitware won the down-select to enter Phase 2. See Aug 27/10 entry for more details, but the biggest change to their Phase II team is the addition of Phase I competitors Lockheed Martin and BAE Systems.
Phase 3 – Integration, Demonstration, and Transition to the military services. Phase 3 efforts must handle query refinement and complex searches that include multiple event types, handle faster rates of streaming video, and larger video archives, and use actual operational Day TV and IR data from a wide range of UAV platforms. The goal is now 95% identification of target events or activities, with no more than 2 false positives per video stream hour, whether the query involves archives or streaming video. The system should be able to change to a different stored query within 1 second, and set up a new query for viewing in 1 minute.
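The phase goals above boil down to three checks per query mode: a minimum identification rate, a ceiling on false positives per video hour, and a query set-up time. The sketch below is illustrative only and is not part of the VIRAT program. The class, field, and function names are hypothetical, and it assumes that “identification” means the fraction of target events the system found. The threshold values are the ones quoted in the phase descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseTargets:
    """Goals quoted for one VIRAT phase (field names are hypothetical)."""
    archive_min_identified: float   # minimum fraction of events found in archives
    archive_max_fp_per_hour: float  # ceiling on false positives per archive video hour
    stream_min_identified: float    # minimum fraction of events found in live streams
    stream_max_fp_per_hour: float   # ceiling on false positives per streamed video hour
    new_query_setup_minutes: float  # time allowed to set up a new query


# Values taken from the Phase 1-3 descriptions in the article.
PHASE_TARGETS = {
    1: PhaseTargets(0.85, 8, 0.75, 12, 10),
    2: PhaseTargets(0.90, 4, 0.85, 6, 5),
    3: PhaseTargets(0.95, 2, 0.95, 2, 1),
}


def meets_targets(phase, mode, events_found, events_total,
                  false_positives, video_hours):
    """Return True if the measured results satisfy the phase goals.

    mode is "archive" or "stream". Assumes identification rate means
    events_found / events_total, which is an interpretation of the article.
    """
    if events_total <= 0 or video_hours <= 0:
        raise ValueError("events_total and video_hours must be positive")
    targets = PHASE_TARGETS[phase]
    if mode == "archive":
        min_identified = targets.archive_min_identified
        max_fp = targets.archive_max_fp_per_hour
    elif mode == "stream":
        min_identified = targets.stream_min_identified
        max_fp = targets.stream_max_fp_per_hour
    else:
        raise ValueError("mode must be 'archive' or 'stream'")
    identified = events_found / events_total
    fp_per_hour = false_positives / video_hours
    return identified >= min_identified and fp_per_hour <= max_fp
```

For example, a hypothetical test run that finds 88 of 100 target events with 30 false positives over 5 hours of video (6 per hour) would clear the Phase 1 archive goal but miss the Phase 2 archive goal, since 88% falls short of 90%.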
The US Defense Advanced Research Projects Agency in Arlington, VA manages these contracts.
Aug 27/10: Kitware, Inc. in Clifton Park, NY receives an $11 million cost plus fixed-price contract modification under the VIRAT program. Kitware representatives have confirmed to DID that this is the VIRAT Phase 2 award, and that they have won the down-select.
Their Phase II program team is somewhat different, and includes both of their Phase I competitors, as well as a few other additions. More to the point, Lockheed Martin will serve as the system integrator for this Phase II effort:
- Honeywell Laboratories, ACS
- BAE Systems, Technology Solutions
- General Dynamics
- Lockheed Martin Missiles and Fire Control Autonomous Systems
- Mayachitra, Inc.
- Raytheon BBN Technologies
- Collaborating universities including: California Institute of Technology (CalTech); University of California (Berkeley, Irvine, Riverside); University of Southern California; Cornell University; University of Central Florida’s Computer Vision Lab; University of Maryland’s Computer Vision Laboratory; Massachusetts Institute of Technology; Rensselaer Polytechnic Institute; Stanford University; and the University of Texas at Austin.
Work will be performed in Clifton Park, NY (47.1%); Littleton, CO (20.5%); Santa Barbara, CA (6.7%); Burlington, MA (3.2%); Golden Valley, MN (3.0%); Los Angeles, CA (2.1%); Orlando, FL (2.1%); New York, NY (2.1%); Berkeley, CA (2.1%); College Park, MD (1.8%); Troy, NY (1.7%); Austin, TX (1.7%); Herndon, VA (1.5%); Cambridge, MA (1.0%); Riverside, CA (0.9%); Ithaca, NY (0.9%); Cambridge, MA (0.9%); Pasadena, CA (0.7%). Work is expected to be complete by February 2012 (HR0011-08-C-0135).
Oct 3/08: Lockheed Martin Missiles and Fire Control in Orlando, FL wins a $5.5 million cost plus fixed fee contract under VIRAT.
Work will be performed in Cherry Hill, NJ; Orlando, FL; Philadelphia, PA; Pittsburgh, PA; and Littleton, CO, with an estimated completion date of March 29/10. Bids were solicited via the Web, with 20 bids received (HR0011-09-C-0027).
Sept 19/08: BAE Systems National Security Solutions in Burlington, MA receives a $7.2 million cost plus fixed fee contract under VIRAT.
Work will be performed in Burlington, MA (70%), Cambridge, MA (21%), and Los Angeles, CA (9%), and is expected to be complete in March 2010. Funds obligated at time of award ($1,803,138) will not expire at the end of the current fiscal year. Bids were solicited via the Web, with 20 bids received (HR0011-08-C-0134).
Sept 15/08: Kitware Inc. in Clifton Park, NY received a $6.7 million firm-fixed-price contract under VIRAT. Kitware’s team proposed building a revolutionary video analyst workstation called Video and Image-Based Retrieval and ANalysis Tool (VIBRANT).
Their team includes General Dynamics Advanced Information Systems, Honeywell Labs, Carnegie Mellon University, the International Computer Science Institute affiliated with UC Berkeley, California Institute of Technology (CalTech), Columbia University, the University of Texas at Austin, Rensselaer Polytechnic Institute, and the University of Maryland.
Work will be performed in Clifton Park, NY; Golden Valley, MN; Ypsilanti, MI; College Park, MD; Austin, TX; Berkeley, CA; Pittsburgh, PA; Troy, NY; and Pasadena, CA, with an estimated completion date of March 11/10. Bids were solicited via the Web, with 20 bids received (HR0011-08-C-0135). See also Kitware release.
- DARPA IPTO – Video and Image Retrieval and Analysis Tool (VIRAT). Includes solicitation.
- MITRE Corp. (May 2010) – War, Drones, and Videotape: A New Tool for Analyzing Video
- Ars Technica (Oct 21/08) – DARPA building search engine for video surveillance footage
- Washington Post (Oct 20/08) – DARPA Contract Description Hints at Advanced Video Spying
- DID – Too Much Information: Taming the UAV Data Explosion. Details some of the early efforts underway.