Behavior Recognition in Video with Extended Models of Feature Velocity Dynamics

Ross Messing, Christopher Pal

Investigations of human perception have shown that non-local spatio-temporal information is critical, and often sufficient, for activity recognition. However, many recent activity recognition systems are based largely on local space-time features and statistical techniques inspired by object recognition research. We develop a new set of statistical models for feature velocity dynamics capable of representing the long-term motion of features. We show that these models can be used to effectively disambiguate behaviors in video, particularly when extended to include information not captured by motion, such as position and appearance. We demonstrate performance surpassing, and in some cases doubling, the accuracy of a state-of-the-art approach based on local features. We expect that long-range temporal information will become more important as technology makes longer, higher-resolution videos commonplace.
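
The abstract does not specify the exact form of the velocity-dynamics models. Purely as an illustrative sketch of the general idea (not the authors' method), the example below models the quantized velocity history of a tracked feature with a per-class first-order Markov chain and classifies a track by likelihood; the number of velocity states and the smoothing constant are arbitrary choices for this sketch.

```python
import numpy as np

N_STATES = 8  # quantized velocity directions (arbitrary choice for this sketch)

def quantize_velocities(track):
    """Map a (T, 2) array of tracked feature positions to discrete velocity-direction states."""
    v = np.diff(track, axis=0)                     # per-frame velocity vectors
    angles = np.arctan2(v[:, 1], v[:, 0])          # direction of motion in [-pi, pi]
    return ((angles + np.pi) / (2 * np.pi) * N_STATES).astype(int) % N_STATES

def fit_markov_chain(tracks, smoothing=1.0):
    """Estimate a transition matrix over velocity states from a set of feature tracks."""
    counts = np.full((N_STATES, N_STATES), smoothing)  # Laplace smoothing
    for track in tracks:
        states = quantize_velocities(track)
        for s, t in zip(states[:-1], states[1:]):
            counts[s, t] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def log_likelihood(track, transition):
    """Score a track's velocity history under a class-specific transition model."""
    states = quantize_velocities(track)
    return sum(np.log(transition[s, t]) for s, t in zip(states[:-1], states[1:]))

def classify(track, class_models):
    """Assign the track to the class whose velocity-dynamics model explains it best."""
    return max(class_models, key=lambda c: log_likelihood(track, class_models[c]))
```

A real system would aggregate such scores over many tracked features per video and, as the abstract notes, would augment velocity dynamics with cues the motion alone does not capture, such as feature position and appearance.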

