Tagging for Likelihood of Gesture Data
Task
Develop a system for detecting the likelihood of gesture data in an interval of an audiovisual file, or even in an individual snapshot.
Can you help Red Hen improve the ELAN-to-RedHen integration?
If so, write to us and we will try to connect you with a mentor.
Related pages
- Gesture detection 2017 (Sergiy Turchyn's project, with slides)
- OpenPose and Hand Keypoint Detection using Deep Learning and OpenCV
- Red Hen Rapid Annotator
- Manual tagging (with proposed Red Hen gesture tagging scheme)
- How to annotate with ELAN (simple instructions to get started)
- How to set up the iMotion annotator (draws rectangles on images to indicate event location)
- How to use the online tagging interface (integrated into Red Hen, but not frame accurate)
- How to use the Video Annotation Tool (online multi-dimensional video annotation interface for talks and demos)
- Integrating ELAN
- Machine Learning
- Video processing pipelines
More information
Sergiy Turchyn has been working on this project; see Gesture detection 2017. Here is the current process used by no_gesture_detection.py:
1) Step through the video frame by frame and extract features for each frame. The frame features are a list of per-person features.
Currently the features are face and hand positions. Faces are detected with the Haar cascade classifier provided by OpenCV. Hands are detected for each person by K-means clustering, using face colors and positions as features. For each person, the left and right hands are distinguished by comparing hand locations to the previous frame. (A simplified sketch of this step appears after the list.)
2) For each frame, decide whether motion is happening (the previous and next frames are consulted as well).
A frame is flagged as containing motion if any of the following holds:
- the number of people differs from the previous frame;
- people cannot be matched to the previous frame (their faces are not close enough in the two frames);
- a person's hand cannot be matched to the previous frame; or
- one of the hands moved at a speed similar to the true-positive data from the timeline gesture detection.
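The following is a minimal sketch of steps 1 and 2, assuming OpenCV's bundled frontal-face Haar cascade, a hypothetical input file input.mp4, and an illustrative matching threshold max_face_shift. The hand clustering and the hand-speed condition are omitted for brevity, so this is not the project's actual code:

```
import cv2
import numpy as np

# OpenCV ships several pretrained Haar cascades; the frontal-face one is standard.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def frame_features(frame):
    """Step 1 (simplified): one (x, y, w, h) face box per detected person."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return list(faces)  # hand clustering omitted for brevity

def has_motion(prev_faces, cur_faces, max_face_shift=30.0):
    """Step 2 (simplified): flag motion when people appear/disappear or a
    face cannot be matched closely enough to one in the previous frame."""
    if len(prev_faces) != len(cur_faces):
        return True
    for (x1, y1, w1, h1) in cur_faces:
        center = np.array([x1 + w1 / 2, y1 + h1 / 2])
        # Match to the nearest face in the previous frame.
        dists = [np.linalg.norm(center - np.array([x + w / 2, y + h / 2]))
                 for (x, y, w, h) in prev_faces]
        if min(dists) > max_face_shift:
            return True
    return False

cap = cv2.VideoCapture("input.mp4")  # hypothetical input file
prev, flags = None, []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cur = frame_features(frame)
    flags.append(prev is not None and has_motion(prev, cur))
    prev = cur
cap.release()
```

Matching each current face to the nearest previous face is the simplest possible association; the real code also has to carry hand identities across frames and apply the speed threshold learned from the true-positive data.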
The existing code takes about 125 minutes to run on a one-hour video. No effort has yet been made to reduce the running time or to parallelize the processing; the sketch below shows one possible approach.
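One possible way to cut the wall-clock time is to split the video into frame ranges and process them in parallel worker processes. This assumes frames can be processed independently (cross-frame matching at chunk boundaries would still need a stitching pass); the file name, the eight-worker pool, and the frame_features stub are illustrative assumptions, not the project's actual code:

```
import cv2
from multiprocessing import Pool

VIDEO = "input.mp4"  # hypothetical input file

def frame_features(frame):
    """Placeholder for the per-frame feature extraction sketched earlier."""
    return []

def process_chunk(bounds):
    start, end = bounds
    cap = cv2.VideoCapture(VIDEO)            # each worker opens its own handle
    cap.set(cv2.CAP_PROP_POS_FRAMES, start)  # seek to the chunk's first frame
    results = []
    for _ in range(end - start):
        ok, frame = cap.read()
        if not ok:
            break
        results.append(frame_features(frame))
    cap.release()
    return results

if __name__ == "__main__":
    cap = cv2.VideoCapture(VIDEO)
    n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()
    workers = 8                               # illustrative pool size
    size = -(-n // workers)                   # ceiling division into chunks
    chunks = [(i, min(i + size, n)) for i in range(0, n, size)]
    with Pool(workers) as pool:
        per_chunk = pool.map(process_chunk, chunks)
    features = [f for chunk in per_chunk for f in chunk]
```

Note that seeking with CAP_PROP_POS_FRAMES is not frame-accurate with every codec and backend, so in practice the chunks may need a small overlap that is trimmed after processing.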