Audio-visual algorithms for person tracking and characterization (baseline)

Summary
Demonstrating Tasks T21, T22 and T24