Podcast: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Episode: Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini
Description: Today, we're joined by Julie Kallini, PhD student at Stanford University, to discuss her recent papers, "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models" and "Mission: Impossible Language Models." For the MrT5 paper, we explore the importance and failings of tokenization in large language models, including inefficient compression rates for under-resourced languages, and dig into byte-level modeling as an alternative. We discuss the architecture of MrT5, its ability to learn language-specific compression rates, its performance on multilingual benchmarks and character-level manipulation tasks, and its efficiency. For the "Mission: Impossible Language Models" paper, we review the core idea behind the...