Podcast: Chaos Computer Club - recent audio-only feed
Episode: Behaviour-Based Quality Assessment of OpenStreetMap Data in Data Scarce Area Using Unsupervised Machine Learning (sotm2025)
Description: This study introduces a behavior-dependent, unsupervised machine learning approach to assess the intrinsic quality of OpenStreetMap (OSM) data in Dhaka, an area that is both data-starved and rapidly urbanizing. Using enriched contributor metadata and Principal Component Analysis (PCA), latent behavioral patterns were extracted and contributors were segmented using KMeans and HDBSCAN. The silhouette score for PCA-based clustering was 0.951. The results show superior interpretability of KMeans over HDBSCAN. This repeatable methodology provides a scalable, reference-free way to bring quality assurance of volunteered geographic information (VGI) datasets to the front line in cases of limited or no authoritative data.
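For readers unfamiliar with the silhouette score quoted above, here is a minimal sketch of how the metric is computed. It is not the study's code: the function name, the toy 2D points, and the labels are all hypothetical, and a real pipeline would more likely call a library routine on the PCA-reduced contributor features. For each point, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster. The point's score is (b - a) / max(a, b), and the overall score is the mean over all points.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so divide by len(own) - 1)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])   # labels match the groups
mixed = silhouette_score(points, [0, 1, 0, 1, 0, 1])  # labels cut across them
print(f"matching labels: {good:.3f}, mixed labels: {mixed:.3f}")
```

Scores range from -1 to 1. Values close to 1, such as the 0.951 reported in the description, indicate compact clusters that are far apart from one another, while values near 0 or below indicate overlapping or misassigned clusters.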
OpenStreetMap (OSM) is an important source of geospatial inform...