Generalized Hough transform based time invariant action recognition with 3D pose information

David Muench; Wolfgang Huebner; Michael Arens

doi:10.1117/12.2065805

7 October 2014 Generalized Hough transform based time invariant action recognition with 3D pose information

David Muench, Wolfgang Huebner, Michael Arens

Proceedings Volume 9253, Optics and Photonics for Counterterrorism, Crime Fighting, and Defence X; and Optical Materials and Biomaterials in Security and Defence Systems Technology XI; 92530K (2014) https://doi.org/10.1117/12.2065805
Event: SPIE Security + Defence, 2014, Amsterdam, Netherlands

Abstract

Human action recognition has emerged as an important field in the computer vision community due to its large number of applications such as automatic video surveillance, content based video-search and human robot interaction. In order to cope with the challenges that this large variety of applications present, recent research has focused more on developing classifiers able to detect several actions in more natural and unconstrained video sequences. The invariance discrimination tradeoff in action recognition has been addressed by utilizing a Generalized Hough Transform. As a basis for action representation we transform 3D poses into a robust feature space, referred to as pose descriptors. For each action class a one-dimensional temporal voting space is constructed. Votes are generated from associating pose descriptors with their position in time relative to the end of an action sequence. Training data consists of manually segmented action sequences. In the detection phase valid human 3D poses are assumed as input, e.g. originating from 3D sensors or monocular pose reconstruction methods. The human 3D poses are normalized to gain view-independence and transformed into (i) relative limb-angle space to ensure independence of non-adjacent joints or (ii) geometric features. In (i) an action descriptor consists of the relative angles between limbs and their temporal derivatives. In (ii) the action descriptor consists of different geometric features. In order to circumvent the problem of time-warping we propose to use a codebook of prototypical 3D poses which is generated from sample sequences of 3D motion capture data. This idea is in accordance with the concept of equivalence classes in action space. Results of the codebook method are presented using the Kinect sensor and the CMU Motion Capture Database.

Citation Download Citation

David Muench, Wolfgang Huebner, and Michael Arens "Generalized Hough transform based time invariant action recognition with 3D pose information", Proc. SPIE 9253, Optics and Photonics for Counterterrorism, Crime Fighting, and Defence X; and Optical Materials and Biomaterials in Security and Defence Systems Technology XI, 92530K (7 October 2014); https://doi.org/10.1117/12.2065805

ACCESS THE FULL ARTICLE

INSTITUTIONAL
Select your institution to access the SPIE Digital Library.

SELECT YOUR INSTITUTION

PERSONAL
Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.

PERSONAL SIGN IN

No SPIE Account? Create one

PURCHASE THIS CONTENT

SUBSCRIBE TO DIGITAL LIBRARY

50 downloads per 1-year subscription

Members: $195

Non-members: $335 ADD TO CART

25 downloads per 1 - year subscription

Members: $145

Non-members: $250 ADD TO CART

PURCHASE SINGLE ARTICLE

Includes PDF, HTML & Video, when available

Members: $17.00

Non-members: $21.00 ADD TO CART

PROCEEDINGS
11 PAGES

DOWNLOAD PAPER SAVE TO MY LIBRARY

GET CITATION

RIGHTS & PERMISSIONS

Get copyright permission Get copyright permission on Copyright Marketplace

KEYWORDS

Video

Video surveillance

Hough transforms

Sensors

3D modeling

Prototyping

Statistical modeling

Show All Keywords

Keywords/Phrases

Search In:

Publication Years