Want to create an interactive transcript for this episode?
Podcast: Chaos Computer Club - recent audio-only feed
Episode: 51 Ways to Spell the Image Giraffe: The Hidden Politics of Token Languages in Generative AI (39c3)
Description: Generative AI models don't operate on human languages – they speak in **tokens**. Tokens are computational fragments that deconstruct language into subword units, stored in large dictionaries. These tokens encode not only language but also political ideologies, corporate interests, and cultural biases even before model training begins. Social media handles like *realdonaldtrump*, brand names like *louisvuitton*, or even *!!!!!!!!!!!!!!!!* exist as single tokens, while other words remain fragmented. Through various artistic and adversarial experiments, we demonstrate that tokenization is a political act that determines what can be represented and how images become computable through language.
Tokens are the fragments of words that ge...