An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers

Summary

Authors: Valentin Hofmann, Hinrich Schuetze, Janet Pierrehumbert

Journal title: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Journal number: May 2022

Journal publisher: Association for Computational Linguistics

Published year: 2022