AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

  • 20210506


Document pages: 23 pages

Abstract: Convolutional operations have two limitations: (1) do not explicitly model where to focus as the same filter is applied to all the positions, and (2) are unsuitable for modeling long-range dependencies as they only operate on a small neighborhood. While both limitations can be alleviated by attention operations, many design choices remain to be determined to use attention, especially when applying attention to videos. Towards a principled way of applying attention to videos, we address the task of spatiotemporal attention cell search. We propose a novel search space for spatiotemporal attention cells, which allows the search algorithm to flexibly explore various design choices in the cell. The discovered attention cells can be seamlessly inserted into existing backbone networks, e.g., I3D or S3D, and improve video classification accuracy by more than 2% on both Kinetics-600 and MiT datasets. The discovered attention cells outperform non-local blocks on both datasets, and demonstrate strong generalization across different modalities, backbones, and datasets. Inserting our attention cells into I3D-R50 yields state-of-the-art performance on both datasets.
