Finding and tracking targets and events in a live video feed is important for many commercial applications, from CCTV surveillance used by police and security firms to the rapid mapping of events from aerial imagery. However, end users typically describe targets in natural language, and interpreting these descriptions in the context of a live video stream is a complex task. Due to current limitations in artificial intelligence, especially in computer vision, this task cannot be automated and instead requires human supervision. Hence, in this paper, we consider the use of real-time crowdsourcing to identify and track targets specified by a natural language description. In particular, we present a novel method for augmenting live video with a real-time crowd.
Published Date: 2015-11-12
Registration: ISBN 978-1-57735-740-7