Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization

Document pages: 11 pages

Abstract: Localizing persons and recognizing their actions from videos is a challenging task towards high-level video understanding. Recent advances have been achieved by modeling direct pairwise relations between entities. In this paper, we take one step further, not only model direct relations between pairs but also take into account indirect higher-order relations established upon multiple elements. We propose to explicitly model the Actor-Context-Actor Relation, which is the relation between two actors based on their interactions with the context. To this end, we design an Actor-Context-Actor Relation Network (ACAR-Net) which builds upon a novel High-order Relation Reasoning Operator and an Actor-Context Feature Bank to enable indirect relation reasoning for spatio-temporal action localization. Experiments on AVA and UCF101-24 datasets show the advantages of modeling actor-context-actor relations, and visualization of attention maps further verifies that our model is capable of finding relevant higher-order relations to support action detection. Notably, our method ranks first in the AVA-Kinetics action localization task of ActivityNet Challenge 2020, out-performing other entries by a significant margin (+6.71 mAP). Training code and models will be available at this https URL.
