Graph CNNs with Motif and Variable Temporal Block for Skeleton-Based Action Recognition

Authors

  • Yu-Hui Wen, Chinese Academy of Sciences
  • Lin Gao, Chinese Academy of Sciences
  • Hongbo Fu, City University of Hong Kong
  • Fang-Lue Zhang, Victoria University of Wellington
  • Shihong Xia, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v33i01.33018989

Abstract

The hierarchical structure and the different semantic roles of joints in a human skeleton convey important information for action recognition. Conventional graph convolution methods for modeling skeleton structure consider only the physically connected neighbors of each joint and joints of the same type, thus failing to capture high-order information. In this work, we propose a novel model with motif-based graph convolution to encode hierarchical spatial structure, and a variable temporal dense block to exploit local temporal information over different ranges of human skeleton sequences. Moreover, we employ a non-local block to capture global dependencies in the temporal domain via an attention mechanism. Our model achieves improvements over the state-of-the-art methods on two large-scale datasets.
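For readers unfamiliar with the non-local block the abstract refers to, the sketch below illustrates the general idea (attention over all time steps plus a residual connection) in plain NumPy. This is an illustrative simplification, not the paper's implementation; all function and variable names here are ours, and details such as normalization and projection sizes are assumptions.

```python
import numpy as np

def nonlocal_block(x, Wq, Wk, Wv):
    """Illustrative non-local (self-attention) block over a temporal sequence.

    x : (T, C) array of per-frame features.
    Wq, Wk, Wv : (C, C) projection matrices (query, key, value).
    Returns an array of the same shape as x.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Pairwise similarity between every pair of time steps (global dependency).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the key dimension.
    scores = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn = scores / scores.sum(axis=-1, keepdims=True)
    # Aggregate values from all frames, then add a residual connection.
    return x + attn @ v

# Example: 4 frames of 8-dimensional joint features.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
y = nonlocal_block(x, Wq, Wk, Wv)
```

Because attention weights span all time steps, each output frame can depend on every other frame, which is what distinguishes this from the local receptive field of a temporal convolution.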

Published

2019-07-17

How to Cite

Wen, Y.-H., Gao, L., Fu, H., Zhang, F.-L., & Xia, S. (2019). Graph CNNs with Motif and Variable Temporal Block for Skeleton-Based Action Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 8989-8996. https://doi.org/10.1609/aaai.v33i01.33018989

Section

AAAI Technical Track: Vision