Want to create an interactive transcript for this episode?
Podcast: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Episode: Reasoning Over Complex Documents with DocLLM with Armineh Nourbakhsh
Description: Today we're joined by Armineh Nourbakhsh of JP Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and an introduction to the DocLLM model. Armineh explains how this model, distinct from both traditional LLMs and document AI models, incorporates both textual semantics and spatial layout in processing enterprise documents like reports and complex contracts. We dig into her teamβs approach to training DocLLM, their choice of a generative model as opposed to an encoder-based approach, the da...