Want to create an interactive transcript for this episode?
Podcast: Machine Learning Street Talk (MLST)
Episode: Transformers Need Glasses! - Federico Barbero
Description: Federico Barbero (DeepMind/Oxford) is the lead author of "Transformers Need Glasses!". Have you ever wondered why LLMs struggle with seemingly simple tasks like counting or copying long strings of text? We break down the theoretical reasons behind these failures, revealing architectural bottlenecks and the challenges of maintaining information fidelity across extended contexts.Federico explains how these issues are rooted in the transformer's design, drawing parallels to over-squashing in graph neural networks and detailing how the softmax function limits sharp decision-making.But it's not...