Podcast: Software Engineering Daily
Episode: Datahub: Open Source Data Lake with Pardhu Gunnam and Mars Lan
Description: As the volume and scope of data collected by an organization grow, tasks such as data discovery and data management grow in complexity. Simply put, the more data there is, the harder it is for users such as data analysts to find what they’re looking for. A metadata hub helps manage big data by providing metadata search and discovery tools and a centralized view of the entire data ecosystem. DataHub is LinkedIn’s open-source metadata search and discovery tool. It is LinkedIn’s second-generation metadata hub, succeeding WhereHows. Pardhu Gunnam and Mar...