Want to create an interactive transcript for this episode?
Podcast: Machine Learning Street Talk (MLST)
Episode: Prof. Randall Balestriero - LLMs without pretraining and SSL
Description: Randall Balestriero joins the show to discuss some counterintuitive findings in AI. He shares research showing that huge language models, even when started from scratch (randomly initialized) without massive pre-training, can learn specific tasks like sentiment analysis surprisingly well, train stably, and avoid severe overfitting, sometimes matching the performance of costly pre-trained models. This raises questions about when giant pre-training efforts are truly worth it.He also talks about how self-supervised learning (where models learn from data structure itself) and traditional supervised learning (using labeled data) are fundamentally similar, allowing researchers to apply decades...