Podcast: Arxiv Papers
Episode: [QA] Position: The Most Expensive Part of an LLM should be its Training Data
Description: This paper argues that compensating human labor for training data is the largest cost in developing Large Language Models, significantly exceeding model training expenses, and suggests fairer practices for the future.
https://arxiv.org/abs//2504.12427
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers<...