Podcast: Machine Learning Street Talk (MLST)
Episode: #73 - YASAMAN RAZEGHI & Prof. SAMEER SINGH - NLP benchmarks
Description: Patreon: https://www.patreon.com/mlst
Discord: https://discord.gg/ESrGqhf5CB
YT version: https://youtu.be/RzGaI7vXrkk
This week we speak with Yasaman Razeghi and Prof. Sameer Singh from UC Irvine. Yasaman recently published a paper called "Impact of Pretraining Term Frequencies on Few-Shot Reasoning", in which she demonstrated comprehensively that large language models only perform well on reasoning tasks because they memorise the dataset. For the first time, she showed that accuracy was linearly correlated with the occurrence rate in the training corpus, something which OpenAI should have done in the...