Sergiy Turchyn has been working on this project. Here is the current process used by no_gesture_detection.py:
1) Look at the video frame by frame and get features for each frame. The frame features are a list with one set of features per person in the frame.
Currently the features include face and hand positions. Faces are detected using a Haar cascade classifier provided by OpenCV. Hands are detected for each person using K-means clustering, with face colors and positions as features. For each person, the left and right hands are determined by looking at the hand locations and comparing them to the previous frame.
2) For each frame, decide whether motion is happening (the previous and next frames are also used for this).
There is motion in a frame if any of the following holds: the number of people differs from the previous frame; people cannot be matched to the previous frame (their faces are not close enough in the two frames); a person's hand could not be matched to the previous frame; or one of the hands moved with a speed similar to that of the true positive data from the timeline gesture detection.
The existing code takes about 125 minutes to run on a 1-hour video. There has been no effort to improve the running time or make it multithreaded.
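The motion test in step 2 can be sketched roughly as follows. This is a minimal illustration, not the code in no_gesture_detection.py: the `PersonFeatures` class, the function names, and both thresholds (`MAX_FACE_DISTANCE`, `GESTURE_SPEED_RANGE`) are hypothetical, an unmatched hand is assumed to be represented as `None`, and only the previous frame is compared (the real process also uses the next frame).

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical thresholds, in pixels and pixels per frame. The real values
# would come from the true positive data of the timeline gesture detection.
MAX_FACE_DISTANCE = 40.0
GESTURE_SPEED_RANGE = (5.0, 60.0)


@dataclass
class PersonFeatures:
    """Assumed per-person features: face position and both hand positions."""
    face: Point
    left_hand: Optional[Point]   # None means the hand could not be matched
    right_hand: Optional[Point]


def distance(a: Point, b: Point) -> float:
    return hypot(a[0] - b[0], a[1] - b[1])


def match_people(prev: List[PersonFeatures], curr: List[PersonFeatures]):
    """Greedily pair each current person with the nearest previous face.

    Returns a list of (previous, current) pairs, or None if some person
    has no previous face within MAX_FACE_DISTANCE.
    """
    remaining = list(prev)
    pairs = []
    for person in curr:
        if not remaining:
            return None
        nearest = min(remaining, key=lambda p: distance(p.face, person.face))
        if distance(nearest.face, person.face) > MAX_FACE_DISTANCE:
            return None
        remaining.remove(nearest)
        pairs.append((nearest, person))
    return pairs


def has_motion(prev: List[PersonFeatures], curr: List[PersonFeatures]) -> bool:
    """Apply the four motion criteria to two consecutive frames."""
    # Criterion 1: the number of people changed.
    if len(prev) != len(curr):
        return True
    # Criterion 2: people cannot be matched by face position.
    pairs = match_people(prev, curr)
    if pairs is None:
        return True
    low, high = GESTURE_SPEED_RANGE
    for before, after in pairs:
        hands = ((before.left_hand, after.left_hand),
                 (before.right_hand, after.right_hand))
        for old, new in hands:
            # Criterion 3: a hand could not be matched to the previous frame.
            if old is None or new is None:
                return True
            # Criterion 4: the hand moved at a gesture-like speed.
            if low <= distance(old, new) <= high:
                return True
    return False
```

For example, with `prev = [PersonFeatures((100, 100), (80, 200), (120, 200))]`, calling `has_motion(prev, prev)` reports no motion, while moving the left hand to `(80, 220)` in the second frame reports motion under these assumed thresholds.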