Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion

Summary

Authors: Szatkowski, Filip; Wójcik, Bartosz; Piórczyński, Mikołaj; Scardapane, Simone

Journal title: The Thirty-Eighth Annual Conference on Neural Information Processing Systems

Journal publisher: NeurIPS

Published year: 2024

DOI identifier: 10.5281/zenodo.14409485