Understanding Biases and Diversity of Big Data used for Mobility Analysis


Our ability to collect, store, and analyze large amounts of data has increased markedly over the past two decades, and today big data plays a critical role in the majority of statistical algorithms. Unfortunately, our understanding of bias in data has not kept pace. Although great progress has been made in developing new models for analyzing data, far less attention has been paid to understanding the fundamental shortcomings of big data.

This project will quantify the biases and uncertainties associated with human mobility data collected through digital means, such as smartphone GPS traces, mobile phone data, and social media data.

Ultimately, we want to ask: is it possible to correct big mobility data through a fundamental understanding of how bias manifests itself?

Value creation

We expect this project to have a long-lasting scientific and societal impact. Scientifically, this work will allow us to explicitly model bias in algorithmic systems that rely on human mobility data and provide insights into which populations are left out. For example, it will allow us to correct for gender, wealth, age, and other types of bias in data used globally for epidemic modeling, urban planning, and many other use cases.

Further, having methods to debias data will allow us to understand the negative impacts that results derived from biased data might have. Given the universal nature of bias, we expect the debiasing frameworks we develop to also pave the way for quantitative studies of bias in other realms of data science.

The societal impact will be actionable recommendations provided to policy makers regarding: 1) guidelines for how to safely use mobility datasets in data-driven decision processes, 2) tools (including statistical and interactive visualizations) for quantifying the effects of bias in data, and 3) directions for building fairer and more equitable algorithms that rely on mobility data.

It is important to address these issues now because, in its “Proposal for a Regulation on a European approach for Artificial Intelligence” from April 2021, the European Commission (European Union) outlines potential future regulations for addressing the opacity, complexity, bias, and unpredictability of algorithmic systems.

This document states that high-quality data is essential for algorithmic performance and suggests that any dataset should be subject to appropriate data governance and management practices, including examination in view of possible biases. This implies that, in the future, businesses and governmental agencies will need to have data-audit methods in place. Our project addresses this gap and provides value by developing methodologies to audit mobility data for different types of biases, producing tools from which Danish society and Danish businesses will benefit.

Project Manager

Vedran Sekara

Assistant Professor

IT University of Copenhagen
Department of Computer Science

E: vsek@itu.dk

Laura Alessandretti

Associate Professor

Technical University of Denmark
DTU Compute

Manuel Garcia-Herranz

Chief Scientist

New York

Elisa Omodei

Assistant Professor

Central European University