Podcast: Nimdzi LIVE!
Episode: TranslateGemma Quality Evaluation / Stress Test feat. Alex Murauski
Description: In this session, we will explore how we evaluated the translation quality of Google's Gemma model using the MQM framework and a human-in-the-loop review process.
The case study walks through how LLM-generated translations were assessed using a structured error typology, how linguistic quality was benchmarked, and how AI-enhanced workflows can combine automated generation with professional post-editing and evaluation.
We'll discuss:
How MQM works in real-world AI evaluation
What kinds of errors LLMs produce across languages
Where AI performs well, and where it still struggles
How to design...