The client is a global entertainment company based in North America. It takes its productions to hundreds of cities worldwide, entertaining millions of fans.
The client traditionally distributed feedback forms at the end of most performances to assess their overall impact. The forms didn’t track the audience’s reaction throughout the performance and were heavily influenced by the last few minutes. This made it hard to analyze whether a performance met its director’s intended goal, which parts of the act the audience connected with most, and which actors performed their roles best. Hence, the client decided to measure impact in real time across the entire performance timeline.
The client engaged Persistent, a Google Cloud Partner, to build a Minimum Viable Model (MVM) using Persistent’s patent-pending solution on Google Cloud Platform (GCP).
The MVM measured the impact of a performance by processing video captures of the audience’s reactions. Before running the videos through the MVM, they were pre-processed and enhanced with color correction and brightness checks. The MVM had two defined video processing pipelines, each of which automatically ingested and processed videos from Google Cloud Storage. Google Compute Engine orchestrated the workflow and video processing, and the results were stored back in Google Cloud Storage.
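The case study doesn’t publish the pre-processing code, but a minimal sketch of such a step, assuming OpenCV, might look like the following. The brightness threshold and the gray-world color correction are illustrative assumptions, not the client’s actual implementation:

```python
import cv2
import numpy as np

BRIGHTNESS_FLOOR = 60  # hypothetical threshold on mean luma (0-255 scale)

def preprocess_frame(frame: np.ndarray) -> np.ndarray:
    """Brightness check plus simple color correction (illustrative only)."""
    # Measure mean brightness on the grayscale image.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if gray.mean() < BRIGHTNESS_FLOOR:
        # Equalize the luma channel to lift dark audience footage.
        yuv = cv2.cvtColor(frame, cv2.COLOR_BGR2YUV)
        yuv[:, :, 0] = cv2.equalizeHist(yuv[:, :, 0])
        frame = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR)
    # Gray-world color correction: scale each channel toward a common mean.
    means = frame.reshape(-1, 3).mean(axis=0)
    frame = np.clip(frame * (means.mean() / means), 0, 255).astype(np.uint8)
    return frame

def preprocess_video(src_path: str, dst_path: str) -> None:
    """Read a video, enhance each frame, and write the corrected copy."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(preprocess_frame(frame))
    cap.release()
    out.release()
```

In the architecture described above, a step like this would run on Compute Engine between the Cloud Storage download and the pipeline inference.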
The MVM ran two pipelines:
Facial Sentiment Analysis
The videos were sent to the Google Vision API for face detection using Deep Learning. Another Deep Learning model quantified the emotion on each face recognized in the videos on a scale of -50 to +50.
The results were then displayed on a timeline showing the different emotions and their intensities. The client measured a performance’s impact by comparing the intended emotion timeline to the audience’s actual emotion timeline.
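Per the case study, the -50 to +50 score came from a separate Deep Learning model, which isn’t disclosed. As a rough, illustrative stand-in, a sketch using the google-cloud-vision client could map the Vision API’s per-face emotion likelihoods to a net score like this:

```python
from google.cloud import vision

# Map Vision API likelihood enums to weights in [0, 1] (illustrative choice).
LIKELIHOOD_WEIGHT = {
    vision.Likelihood.VERY_UNLIKELY: 0.0,
    vision.Likelihood.UNLIKELY: 0.25,
    vision.Likelihood.POSSIBLE: 0.5,
    vision.Likelihood.LIKELY: 0.75,
    vision.Likelihood.VERY_LIKELY: 1.0,
    vision.Likelihood.UNKNOWN: 0.0,
}

def score_frame(frame_bytes: bytes) -> list[float]:
    """Detect faces in one frame and return a -50..+50 score per face.

    This simple likelihood mapping stands in for the client's proprietary
    emotion model; it is not the actual scoring logic.
    """
    client = vision.ImageAnnotatorClient()
    response = client.face_detection(image=vision.Image(content=frame_bytes))
    scores = []
    for face in response.face_annotations:
        positive = LIKELIHOOD_WEIGHT[face.joy_likelihood]
        negative = max(
            LIKELIHOOD_WEIGHT[face.sorrow_likelihood],
            LIKELIHOOD_WEIGHT[face.anger_likelihood],
        )
        # Net sentiment scaled to the case study's -50..+50 range.
        scores.append(50.0 * (positive - negative))
    return scores
```

Scoring each sampled frame this way and averaging across faces would yield one point per timestamp on the audience emotion timeline described above.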
Action Detection Analysis
A Deep Learning based action detection model detected applause by considering the videos’ temporal information, processing sets of frames sequentially. Being able to determine when, and in which part of the audience, applause occurred gave insight into the performance’s effectiveness and audience engagement.
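The model behind the applause detector isn’t disclosed. A common pattern for this kind of temporal action detection is a 3D convolutional network run over sliding windows of frames; below is a minimal, untrained PyTorch sketch with made-up input shapes, intended only to show the frame-sequence idea:

```python
import torch
import torch.nn as nn

class ApplauseDetector(nn.Module):
    """Toy 3D-CNN binary classifier over short clips (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # Conv3d mixes information across time as well as space,
            # which lets the model pick up clapping motion across frames.
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, 1)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, channels=3, time, height, width)
        x = self.features(clips).flatten(1)
        return torch.sigmoid(self.classifier(x))  # P(applause) per clip

# Slide a 16-frame window (step 8) over a video to score each window.
model = ApplauseDetector()
video = torch.randn(1, 3, 64, 112, 112)  # stand-in for real decoded frames
windows = video.unfold(2, 16, 8)          # -> (1, 3, 7, 112, 112, 16)
windows = windows.permute(0, 2, 1, 5, 3, 4).reshape(-1, 3, 16, 112, 112)
with torch.no_grad():
    probs = model(windows)                # one applause probability per window
```

Plotting the per-window probabilities against time would localize applause to specific moments of the performance, as the pipeline description above requires.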
Results
The MVM was delivered in three weeks and provided the following capabilities:
- Face detection and emotion classification to understand the impact of the performance
- Applause detection to measure audience engagement
As a next step, the client and Persistent are discussing additional pipelines, such as identifying emotions in the audio and analyzing the ambience: whether spectators are entering or leaving the venue, the ratio of empty to filled seats, and more.