Podcast: Software Engineering Daily

Episode: Snorkel: Training Dataset Management with Braden Hancock

Description: Machine learning models require training data, and that data needs to be labeled. Today, we have high-quality data infrastructure tools such as TensorFlow, but we don’t have large, high-quality data sets. For many applications, the state of the art is to manually label training examples and feed them into the training process. Snorkel is a system for scaling the creation of labeled training data. In Snorkel, human subject matter experts create labeling functions, and these functions are applied to large quantities of data in order to label it. For exa...
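
The labeling-function idea in the description can be illustrated with a minimal sketch in plain Python. It does not use the Snorkel library, and every name here (`lf_contains_link`, `lf_short_message`, `lf_mentions_prize`, `weak_label`, the spam/ham task itself) is a hypothetical chosen for illustration. The votes are combined by simple majority, which is an assumption made to keep the example short; Snorkel itself is described as learning how accurate each labeling function is rather than taking a plain vote.

```python
from collections import Counter

# Hypothetical label values for an illustrative spam/ham task.
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    """Heuristic: messages containing a URL are likely spam."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_short_message(text):
    """Heuristic: very short messages are likely legitimate."""
    return HAM if len(text.split()) < 5 else ABSTAIN


def lf_mentions_prize(text):
    """Heuristic: messages mentioning a prize are likely spam."""
    return SPAM if "prize" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_short_message, lf_mentions_prize]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return counts[0][0]


if __name__ == "__main__":
    examples = [
        "Claim your prize at http://example.com now and win big",
        "see you soon",
    ]
    for text in examples:
        print(weak_label(text), text)
```

Each function encodes one rule of thumb that a subject matter expert might write down, and each may abstain when its rule does not apply. Applying all of them to a large unlabeled collection yields noisy labels without anyone labeling examples one by one.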
