Multimodal and privacy-aware audio-visual intelligence - final version

Summary
This deliverable is the final version of the work on multimodal and privacy-aware audio-visual intelligence.