Study of Large Data Resources for Multilingual Training and System Porting