First language models trained

Summary
The language models will be made available for download; however, they may not include all of the data or the cleanest data.