In this paper, we examine how depth data can benefit existing object masking methods applied to occluded scenes. Masking the pixels that belong to each object gives a computer spatial awareness of where objects lie within an image. The current state-of-the-art algorithm for masking objects in images is Mask R-CNN, which extends the Faster R-CNN network to mask object pixels rather than only detect bounding boxes. We examine the weaknesses of Mask R-CNN in masking people who are occluded in a frame, and then consider how depth data gathered from an RGB-D sensor can help. A case study shows that simply thresholding the depth information can aid in distinguishing occluded persons. Our aim is to examine how features derived from depth data can benefit object pixel masking methods in an explainable manner, especially in complex scenes with multiple objects.
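To make the idea of depth thresholding concrete, the following is a minimal sketch rather than the method used in the case study. It assumes a single predicted mask that has merged two people standing at different distances, a depth map in metres aligned pixel-for-pixel with the mask, and a hand-picked threshold. The function name `split_mask_by_depth`, the 1.8 m threshold, and the treatment of non-positive depth values as missing readings are all illustrative assumptions.

```python
from typing import List, Tuple

Mask = List[List[bool]]
DepthMap = List[List[float]]


def split_mask_by_depth(
    mask: Mask, depth: DepthMap, threshold_m: float
) -> Tuple[Mask, Mask]:
    """Split one mask into a near part and a far part with one depth threshold.

    Hypothetical helper: pixels with depth <= 0 are assumed to be missing
    sensor readings and are left out of both output masks.
    """
    near = [[False] * len(row) for row in mask]
    far = [[False] * len(row) for row in mask]
    for y, row in enumerate(mask):
        for x, inside in enumerate(row):
            if not inside:
                continue  # pixel is not part of the predicted mask
            d = depth[y][x]
            if d <= 0:
                continue  # assumed missing depth reading
            if d < threshold_m:
                near[y][x] = True
            else:
                far[y][x] = True
    return near, far


# Toy 2x4 scene: the left columns are about 1 m away, the right about 2.5 m.
mask = [[True, True, True, True],
        [True, True, True, True]]
depth = [[1.0, 1.1, 2.4, 2.5],
         [1.0, 0.0, 2.4, 2.6]]

near, far = split_mask_by_depth(mask, depth, threshold_m=1.8)
```

Because the split depends on one interpretable number, it is easy to see why a pixel was assigned to the nearer or the farther person, which is the kind of explainability the abstract refers to. A fixed threshold is the simplest choice; in practice it would need to be chosen per scene.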
Published Date: 2020-06-02
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print) ISBN 978-1-57735-835-0 (10 issue set)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2020, Association for the Advancement of Artificial Intelligence All Rights Reserved