Multimodal and privacy-aware audio-visual intelligence - initial version

Summary
This deliverable presents the initial version of the work on multimodal and privacy-aware audio-visual intelligence.