Human action recognition is a complex area of computer vision because static object characteristics, motion and temporal information all have to be taken into account. Furthermore, actions are divided into human actions (walking, running, jogging), human-human interactions (handshaking, kissing, punching), human-object interactions (calling, writing, driving a car) and group activities (football, soccer, group stealing). Environment variations such as moving backgrounds, different viewpoints or occlusions make the detection and classification of actions even more difficult. Additionally, each actor has their own style of performing an action, which leads to many variations in the subject's movement and a large intra-class variation.
A Random Forest consists of CART-like decision trees, each constructed independently on its own bootstrap sample. Compared to other ensemble learning algorithms, e.g. boosting, which builds a flat structure of decision stumps, a Random Forest uses an ensemble of decision trees and is multi-class capable.
A tree is grown using the following algorithm:
A completed classifier consists of several trees. Each tree votes for a class, the class probabilities are estimated from these votes, and the sample is assigned the label of the majority class.
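To make the training and voting steps concrete, here is a minimal sketch in plain Python. The individual steps of the tree-growing algorithm are not listed in this text, so the sketch assumes the commonly described procedure: draw a bootstrap sample, test a random subset of features at each node, pick the split with the lowest Gini impurity, and grow the tree without pruning. All function names, parameters (`n_trees`, `n_try`, `seed`) and the toy feature values are hypothetical and chosen only for illustration; this is not the implementation used for the results referenced here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini impurity most, or None."""
    best, best_score = None, gini(y)
    n = len(y)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(X, y, n_try, rng):
    """Grow an unpruned tree; every node tests a random subset of n_try features."""
    if len(set(y)) == 1:
        return y[0]  # pure leaf
    feature_ids = rng.sample(range(len(X[0])), n_try)
    split = best_split(X, y, feature_ids)
    if split is None:
        # Simplification (assumption): if the sampled features give no
        # improvement, stop and return the majority label as a leaf.
        return Counter(y).most_common(1)[0][0]
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], n_try, rng),
            grow_tree([X[i] for i in right], [y[i] for i in right], n_try, rng))

def predict_tree(tree, x):
    """Walk from the root to a leaf. Inner nodes are tuples, leaves are labels,
    so class labels must not themselves be tuples in this sketch."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def train_forest(X, y, n_trees=25, n_try=1, seed=0):
    """Train n_trees trees, each on its own bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw len(y) indices with replacement.
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], n_try, rng))
    return forest

def predict_proba(forest, x):
    """Class probabilities estimated as the fraction of trees voting for each class."""
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return {label: count / len(forest) for label, count in votes.items()}

def predict(forest, x):
    """Label with the highest estimated probability (majority vote)."""
    proba = predict_proba(forest, x)
    return max(proba, key=proba.get)

# Made-up 2-D feature vectors standing in for real motion descriptors.
X_toy = [[0.1, 1.0], [0.2, 0.9], [0.15, 1.1], [0.9, 0.1], [1.0, 0.2], [0.95, 0.15]]
y_toy = ["walking"] * 3 + ["running"] * 3
forest = train_forest(X_toy, y_toy)
print(predict(forest, [0.1, 1.0]), predict_proba(forest, [0.1, 1.0]))
```

Because every tree sees a different bootstrap sample and a different random feature subset, the trees disagree on some inputs. The vote fractions returned by `predict_proba` are what the text calls class probabilities, and they work for any number of classes without modification.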
Hollywood dataset:
UCF Sports dataset:
KTH dataset:
TNT Action Recognition using Motion Binary Patterns:
Acceleration-based action recognition with an inertial sensor by http://www.xsens.com