Report on integrated semantic evaluation metric

Summary
This report describes our final approach to human and automatic evaluation of semantic accuracy (T5.2, T5.3).