August 2012
Volume 12, Issue 9
Vision Sciences Society Annual Meeting Abstract  |   August 2012
Statistics of natural action structures and human action recognition
Author Affiliations
  • Xiaoyuan Zhu
    Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, USA
  • Zhiyong Yang
    Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, USA; Department of Ophthalmology, Georgia Health Sciences University, Augusta, Georgia, USA
  • Joe Tsien
    Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, USA; Department of Neurology, Georgia Health Sciences University, Augusta, Georgia, USA
Journal of Vision August 2012, Vol.12, 834. doi:10.1167/12.9.834
      Xiaoyuan Zhu, Zhiyong Yang, Joe Tsien; Statistics of natural action structures and human action recognition. Journal of Vision 2012;12(9):834. doi: 10.1167/12.9.834.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Humans can detect, recognize, and classify a wide range of actions quickly and with ease. Despite enormous research efforts, it remains unknown which spatial-temporal features should be encoded and what the statistics of these features are in natural human actions. In this work, we proposed natural action structures, i.e., multi-size, multi-scale, spatial-temporal concatenations of features, as the basic encoding units of natural human actions. We took several steps to compile these structures. First, we sampled a large number of sequences of circular patches at multiple spatial and temporal scales. The spatial and temporal scales were coupled so that sequences at finer spatial scales had shorter durations. Second, we performed independent component analysis on the patch sequences and grouped the resulting independent components into clusters using the k-means method. Finally, we compiled a large set of natural action structures, each corresponding to a unique combination of the clusters at all the spatial and temporal scales. We examined the statistics of these natural action structures and selected a set of highly informative structures for action recognition. To evaluate the utility of these natural action structures, we used them as inputs to two widely used pattern recognition methods, Latent Dirichlet Allocation and the Support Vector Machine, to classify a range of human actions in the popular KTH and Weizmann datasets. We found that natural action structures obtained in this way achieved significantly better recognition performance than simple spatial-temporal features and that the performance was better than or comparable to that of the best current models. We thus concluded that natural action structures can be used as the basic encoding units of human actions and activities and may hold the key to understanding the human ability to recognize actions.
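The clustering and compilation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The independent component analysis step is omitted, and seeded random vectors stand in for the per-scale feature vectors. The sketch also assumes that one structure is formed per sampled location by combining that location's cluster label at each scale. The function names (`kmeans`, `action_structures`) and all constants are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels


def action_structures(labels_per_scale):
    """Combine the cluster labels of each location across scales into one tuple."""
    return list(zip(*labels_per_scale))


# Toy stand-in data (assumption): 60 locations, 3 scales, 4-dimensional features.
N_LOCATIONS, N_SCALES, K = 60, 3, 2
rng = random.Random(1)
labels_per_scale = []
for scale in range(N_SCALES):
    feats = [
        tuple(rng.gauss(3.0 * (i % 2), 1.0) for _ in range(4))
        for i in range(N_LOCATIONS)
    ]
    _, labels = kmeans(feats, K, seed=scale)
    labels_per_scale.append(labels)

structures = action_structures(labels_per_scale)
counts = Counter(structures)  # occurrence statistics of each structure
for structure, n in counts.most_common(3):
    print(structure, n)
```

In a fuller pipeline, the counts of such structures per video could serve as the input features for a classifier such as a Support Vector Machine or a topic model such as Latent Dirichlet Allocation, which is how the abstract describes the evaluation.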

Meeting abstract presented at VSS 2012
