In the latest news, Google's AI is watching us. Not literally, but through YouTube. Google has fed its AI system a curated set of 57,600 video clips, in which more than 96,000 people appear, to help it understand and predict human behavior more accurately. The dataset focuses on 80 actions, and the clips are drawn from movies of various genres and from various countries. Google has labeled each person in a clip separately, so the AI can tell whether two people in a given scene are shaking hands, hugging, or kissing while they hug. By working through all these clips, the AI is meant to build a better understanding of how humans interact in social situations.
This collection of videos that Google's AI is working through is called AVA (Atomic Visual Actions). Each clip is three seconds long and shows people doing basic things such as hugging, shaking hands, or drinking from a bottle. “Despite exciting breakthroughs made over the past years in classifying and finding objects in images, recognizing human actions still remains a big challenge,” Google commented. “This is due to the fact that actions are, by nature, less well-defined than objects in videos.”
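To make the idea of per-person action labels concrete, here is a minimal sketch of how one record in a dataset like AVA might be represented. The field names and structure below are assumptions for illustration only, not the published AVA annotation schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PersonAnnotation:
    """One labeled person in a three-second clip (illustrative, not the official schema)."""
    person_id: int                                     # each person in the clip is labeled separately
    bounding_box: Tuple[float, float, float, float]    # where the person appears in the frame
    actions: List[str]                                 # one or more of the ~80 atomic actions


@dataclass
class Clip:
    """A single three-second movie excerpt with its per-person labels (hypothetical layout)."""
    video_id: str
    start_time: float                                  # seconds into the source movie
    people: List[PersonAnnotation]


# Example: two people in the same clip, each carrying their own action labels
clip = Clip(
    video_id="example_movie",
    start_time=902.0,
    people=[
        PersonAnnotation(0, (0.10, 0.20, 0.45, 0.90), ["hug", "kiss"]),
        PersonAnnotation(1, (0.50, 0.20, 0.85, 0.90), ["hug"]),
    ],
)
print(f"{len(clip.people)} people labeled in clip starting at {clip.start_time}s")
```

The point of this layout is that labels attach to individual people rather than to the clip as a whole, which is what lets the system learn that one person may be hugging while the other is also kissing.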