Our capabilities to collect, store and analyze vast amounts of data have greatly increased in the last two decades, and today big data plays a critical role in a large majority of statistical algorithms. Unfortunately, our understanding of biases in data has not kept up. While there has been lot of progress in developing new models to analyze data, there has been much less focus on understanding the fundamental shortcomings of big data.
This project will quantify the biases and uncertainties associated with human mobility data collected through digital means, such a smartphone GPS traces, cell phone data, and social media data.
Ultimately, we want to ask the question: is it possible to fix big mobility data through a fundamental understanding of how biases manifest themselves?
Project period: 2022-2024
Project Manager
Value creation
We expect this project to have a long-lasting scientific and societal impact. The scientific impact of this work will allow us to explicitly model bias in algorithmic systems relying on human mobility data and provide insights into which population are left out. For example, it will allow us to correct for gender, wealth, age, and other types of biases in data globally used for epidemic modeling, urban planning, and many other usecases.
Further, having methods to debias data will allow us to understand what negative impacts results derived from biased data might have. Given the universal nature of bias, we expect our developed debiasing frameworks will also pave the way for quantitative studies of bias in other realms of data science.
The societal impact will be actionable recommendations provided to policy makers regarding: 1) guidelines for how to safely use mobility datasets in data-driven decision processes, 2) tools (including statistical and interactive visualizations) for quantifying the effects of bias in data, and 3) directions for
building fairer and equitable algorithm that rely on mobility data.
It is important to address these issues now, because in their “Proposal for a Regulation on a European approach for Artificial Intelligence” from April 2021 the European Commission (European Union) outlines potential future regulations for addressing the opacity, complexity, bias, and unpredictability of algorithmic systems.
This document states that high-quality data is essential for algorithmic performance and suggest that any dataset should be subject to appropriate data governance and management practices, including examination in view of possible biases. This implies that in the future businesses and governmental agencies will need to have data-audit methods in place. Our project addresses this gap and provides value by
developing methodologies to audit mobility data for different types of biases — producing tools which Danish society and Danish businesses will benefit from.