Categories
Completed project AI Future of work Green Tech News

Explainable AI to disrupt the grain industry and secure trust among farmers

4 July 2023

The agricultural sector is the least digitized sector in the world, and a large part of food quality assurance is still done manually. A research project aims to strengthen understanding of and trust in AI and image analysis, which can improve quality assurance and food quality and optimize production.

One of the major barriers to using AI and image analysis in the agriculture and food industry is trust that it works.

Today, manual visual inspection of grain is still one of the most important quality assurance procedures in the entire value chain that brings grain from field to table, and it ensures that the farmer gets the right price for their grain.

Can you find ‘Okapi’ in these pictures? Ph.D. student Lenka Tetková from DTU uses this example to explain how image classification works.

An important competitive advantage

As a global producer of niche products, FOSS must always stay two steps ahead of competitors.

– To ensure there is a market for us in the future, it is crucial to be the first with new solutions. It is challenging to make a profit if there is already a player doing it better, which is why we constantly introduce new digital technologies to improve our analysis tools. And here, collaboration with researchers from the country’s universities is very valuable to us, as we gain new insights and proposed solutions for the further development of our tools, says Erik Schou Dreier and continues:

– In this project, we hope that collaboration with researchers will lead to the development of AI methods and tools that enable us to create new solutions for automated image-based quality assessment and, secondly, that we can increase trust in our product with explainable AI. It is one of the critical themes for us—to create a product that is trusted.

Facts about FOSS

FOSS’ measuring instruments are used everywhere in the agriculture and food industry to quality assure a wide range of raw materials and finished food products.

Traditionally, light wavelengths are measured, and the measurements are used to obtain chemical information about a product. This can include knowledge about protein and moisture content in grains or fat and protein in milk, etc.

FOSS’ customers are large global companies that use FOSS’ products to quality assure and optimize their production—and to ensure the right pricing, so, for example, the farmer gets the right price for their grain.

Deep Learning and Automation of Imaging-based Quality of Seeds and Grains

Project Period: 2020-2024
Budget: DKK 3.91 million

Project participants:

Lenka Tetková, Ph.D. student, DTU
Lars Kai Hansen, Professor, DTU
Kim Steenstrup Pedersen, Professor, KU
Thomas Nikolajsen, Head of Front-end Innovation, FOSS
Toke Lund-Hansen, Head of Spectroscopy Team, FOSS
Erik Schou Dreier, Senior Scientist, FOSS

What is a Deep Learning Neural Network?

Deep learning neural networks are computer systems inspired by how our brains function. They consist of artificial neurons, called nodes, organized in layers. Each node takes in information, processes it, and passes it on to the next layer. This helps the network understand data and make predictions. By training the network with examples and adjusting the connections between nodes, it learns to make accurate predictions on new data. Deep learning neural networks are used for tasks such as image recognition, language understanding, and problem-solving.
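The description above can be made concrete with a very small network. The following is a minimal sketch, not the model used in the project: it assumes a toy task (XOR), one hidden layer of three nodes, and plain gradient descent, and all names in it are illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """Two input nodes -> one hidden layer of tanh nodes -> one sigmoid output node."""

    def __init__(self, n_in=2, n_hidden=3, seed=0):
        rng = random.Random(seed)
        # The "connections between nodes" are these weights and biases.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Each hidden node processes the inputs and passes its value to the next layer.
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def loss(self, data):
        # Average cross-entropy; the prediction is clamped to avoid log(0).
        total = 0.0
        for x, y in data:
            p = min(max(self.forward(x)[1], 1e-12), 1 - 1e-12)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
        return total / len(data)

    def train_step(self, data, lr):
        # One step of full-batch gradient descent (backpropagation written out by hand).
        n_hidden = len(self.w2)
        g_w1 = [[0.0] * len(self.w1[0]) for _ in range(n_hidden)]
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in data:
            hidden, p = self.forward(x)
            d_out = p - y  # gradient of the loss w.r.t. the output node's input
            g_b2 += d_out
            for j in range(n_hidden):
                g_w2[j] += d_out * hidden[j]
                d_hidden = d_out * self.w2[j] * (1 - hidden[j] ** 2)
                g_b1[j] += d_hidden
                for i, xi in enumerate(x):
                    g_w1[j][i] += d_hidden * xi
        scale = lr / len(data)
        self.b2 -= scale * g_b2
        for j in range(n_hidden):
            self.w2[j] -= scale * g_w2[j]
            self.b1[j] -= scale * g_b1[j]
            for i in range(len(self.w1[j])):
                self.w1[j][i] -= scale * g_w1[j][i]


# Toy training examples (XOR), chosen only for illustration.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
net = TinyNetwork()
loss_before = net.loss(data)
for _ in range(2000):
    net.train_step(data, lr=0.3)
loss_after = net.loss(data)
print(f"loss before training: {loss_before:.3f}, after: {loss_after:.3f}")
```

Real image classifiers follow the same pattern but use many more layers and nodes, and are built with a deep learning framework rather than written by hand.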

Categories
Completed project Explore project

Cyber-Physical Systems with Humans in the Loop

DIREC project

Summary

Constructing cyber-physical systems with humans in the loop enables new application areas, such as bio-computing, active learning systems, and intelligent medical systems. Many of the new applications are envisioned, developed, and implemented to let humans and machines collaborate on real-world tasks. These applications therefore have aspects of both Cyber-Physical Systems (CPS) and Socio-Technical Systems (STS) and are characterized by close interplay between software technologies, including design for situational awareness, safety, privacy, usability, and easy error handling.

To build a community around the topic, the project will define cross-disciplinary terminology for the research areas involved, list challenges with a focus on new application areas, and survey the state of the art for the identified challenges. In workshops, the project will map which of the listed challenges are important for Danish industry to address in future work. The project will combine literature studies with workshops and promote future collaboration to tackle existing challenges. The project's goal is to foster collaboration among DIREC partners on this topic and to publish a survey based on the results of the work.

Project period: 2021-2023
Budget: DKK 0.46 million

Project Manager

  • Associate Professor Mahyar Tourchi Moghaddam
  • Maersk Mc-Kinney Moller Institute, SDU
  • mtmo@mmmi.sdu.dk

Scientific value: The project will provide better terminology and a common understanding of the state of the art across several areas of research within DIREC and disseminate this knowledge to the scientific community.

Capacity building: The project will establish new collaboration setups within DIREC and involve master students in the activities.

Business value: The project will in workshops disseminate knowledge to Danish industry and identify cases that could be relevant areas of collaboration for DIREC with Danish Industry in future larger projects. The project will among others connect to the community involved in the Nordic IoT Center.

Value

The project will develop better terminology and a shared understanding of the state of the art across several research areas within DIREC and disseminate this knowledge to the research community.

Categories
Completed project Explore project

Re-Use of Robotic-data in Production through search, simulation and learning

DIREC project

Summary

A robot database with information about previous robot solutions can save manufacturing companies time and money and give smaller companies the opportunity to automate their production as well.

Although it sounds simple, creating a robot database involves several challenges. Robot data are complicated, for example, because they consist of images, trajectories, force vectors, information about different materials, CAD files, and so on.

With input from industry and international experts, this completed project has gained a much better understanding of the challenges. The next step is to develop software that enables the reuse of robot data.

Project period: 2021-2022

Project manager

  • Professor Norbert Krüger
  • Maersk Mc-Kinney Moller Institute, SDU
  • norbert@mmmi.sdu.dk

A robot database with information on previous robot solutions can save manufacturing companies time and money and allow for smaller-scale companies to automate their production as well. This is the conclusion of the ReRoPro project. Although it sounds simple, there are several challenges involved with creating a robot database. For example, robot data are complicated as they consist of images, trajectories, force vectors, information on different materials, CAD-files etc. With input from industry and international experts, the researchers have now gained a much better understanding of the challenges.

The next step is to apply for funding to develop software that allows for the reuse of robot data. The research project was a collaboration between the University of Southern Denmark, the University of Copenhagen, and Aalborg University, with the companies Rockwool, Novo Nordisk, Nordbo Robotics, and WellTec as partners.
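To illustrate why such data are hard to store and search, here is a minimal sketch of what a single record in a robot database might look like. It is purely hypothetical: the field names, the example entries, and the `find_solutions` helper are assumptions made for illustration and are not part of the ReRoPro project.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vector3 = Tuple[float, float, float]


@dataclass
class RobotSolution:
    """One earlier robot solution; every field name here is hypothetical."""
    task: str
    materials: List[str]
    trajectory: List[Vector3]       # waypoints of the tool path
    force_vectors: List[Vector3]    # measured forces along the path
    image_files: List[str] = field(default_factory=list)
    cad_file: Optional[str] = None


def find_solutions(database, task_keyword, material=None):
    """Return earlier solutions whose task mentions the keyword (and material, if given)."""
    keyword = task_keyword.lower()
    return [s for s in database
            if keyword in s.task.lower()
            and (material is None or material in s.materials)]


# Invented example entries, for illustration only.
database = [
    RobotSolution("Insert peg into housing", ["steel"],
                  [(0.0, 0.0, 0.1), (0.0, 0.0, 0.0)], [(0.0, 0.0, -4.5)],
                  cad_file="housing.step"),
    RobotSolution("Pick insulation panel", ["stone wool"],
                  [(0.2, 0.1, 0.3)], [(0.0, 0.0, -1.0)]),
]

matches = find_solutions(database, "peg", material="steel")
print([s.task for s in matches])
```

Even this toy record mixes geometry, sensor readings, and file references, which hints at why search and reuse across such records is a research problem rather than a simple lookup.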

Value

The project will gain valuable knowledge about how to build a robot database that can save manufacturing companies time and money and give smaller companies the opportunity to automate their production.

Categories
Completed project Explore project

DeCoRe: Tools and Methods for the Design and Coordination of Reactive Hybrid Systems

DIREC project

Summary

A recurring problem for digitalized companies is designing and coordinating hybrid systems that include IoT (Internet of Things), edge, and cloud solutions. The methods and tools currently in use are not effective for this purpose because they rely too heavily on informal specifications that are written and interpreted manually by humans.

We aim to explore the applicability of the latest technologies and methods developed at SDU, KU, and AAU to the design of reactive hybrid IoT-edge-cloud architectures in Danish industry. These technologies are based on unambiguous formal languages that computers can process to check desirable design properties (such as compatibility of software interfaces) and to implement components that monitor whether systems function correctly. Applying these techniques has been shown to increase productivity in digital industries substantially (for example, up to a 4x increase in development speed).
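As a rough illustration of a machine-processable specification that also drives a monitoring component, the sketch below encodes a protocol as a finite-state machine and checks observed events against it. This is a simplified stand-in and does not use Jolie, UPPAAL, or DCR Graphs; the sensor protocol and all names are hypothetical.

```python
class ProtocolMonitor:
    """Checks a stream of events against a finite-state protocol specification."""

    def __init__(self, transitions, start):
        self.transitions = transitions  # maps (state, event) -> next state
        self.state = start
        self.violations = []

    def observe(self, event):
        """Advance on an allowed event; record a violation otherwise."""
        next_state = self.transitions.get((self.state, event))
        if next_state is None:
            self.violations.append((self.state, event))
            return False
        self.state = next_state
        return True


# Hypothetical specification of an IoT sensor session.
SENSOR_SPEC = {
    ("idle", "connect"): "connected",
    ("connected", "send_reading"): "connected",
    ("connected", "disconnect"): "idle",
}


def check_trace(events):
    """Run a fresh monitor over a list of events and return the violations found."""
    monitor = ProtocolMonitor(SENSOR_SPEC, start="idle")
    for event in events:
        monitor.observe(event)
    return monitor.violations


print(check_trace(["connect", "send_reading", "disconnect"]))
print(check_trace(["send_reading", "connect"]))
```

Because the specification is data rather than prose, the same table could in principle be used both to generate the monitor and to check at design time that two components agree on their interface.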

Our goals are to:

  1. carry out a concrete use case with a partner company (Sanovo Technology Group)
  2. initiate knowledge sharing on this topic between AAU, KU, and SDU through workshops
  3. disseminate our results to the rest of the DIREC community.

Project period: 2021-2023

Project manager

  • Professor Fabrizio Montesi
  • Department of Mathematics and Computer Science, SDU
  • fmontesi@imada.sdu.dk

Scientific value
The scientific value of the project is twofold:

  • (a) concrete knowledge on the advantages and potential challenges brought by the application of cutting-edge techniques like Jolie for the development of hybrid systems (IoT-edge-cloud) in the Danish industry (using Sanovo Technology Group for the case study); and
  • (b) knowledge on the synergies and future directions for the integration of forefront scientific methods for hybrid systems developed by Danish universities (Jolie, UPPAAL, DCR Graphs). Providing a perspective that comes from concrete industrial experience, with substantiated needs, has significant potential to influence the future development of both research and industrial development in Denmark.

Capacity building
Companies will thus benefit from an increased number of students that they can hire to satisfy their needs with respect to hybrid systems. Universities benefit by gaining sustainable candidates for PhD positions in future projects connected to this exploration.

Business and societal value
Due to the growth potential in solutions for automation and data-intensive processing, this project will strengthen Danish competitiveness through a reduced cost of developing, deploying, and running IoT and cloud software. Potentially, this could lead to increased export of IT products and services.

Categories
Completed project Education project

Initiatives to improve recruitment and retention of IT students

DIREC project

Summary

Denmark needs more IT specialists. But how do we get more young people to study computer science and become IT specialists? This project, which consists of two subprojects, focuses on initiatives that can improve both the recruitment and the retention of a larger and more diverse group of young people, e.g., female students and students without prior programming experience.

Project manager

  • Professor Claus Brabrand
  • Department of Computer Science, ITU
  • brabrand@itu.dk

Diversity or Not: Heterogeneous vs Homogeneous study groups

Summary

The first subproject Diversity or Not: Heterogeneous vs Homogeneous Student Groups? will study the effect of diversity on the formation of CS student groups. The intent is to uncover evidence to issue recommendations on how to best form project groups. We expect this knowledge to be beneficial for the recruitment and retention of students as well as for the diversity of students.

Value Creation

We expect the outcomes of this project will create significant value for primarily the Danish universities, but also for the Danish tech industry (technology companies). The project intends to derive research-based recommendations on how to best form (student) project groups. Since group work is so widespread in Computer Science education in all of Denmark to foster communication and collaboration skills in connection with a problem, it is important to figure out what works best. This will strengthen the CS education in Denmark.

Studying the impact of diversity on project groups will also be important as a proxy for professional groups in a work context, beyond university (with the obvious external threats to validity of this generalization). We expect this knowledge to be beneficial for the recruitment and retention of students as well as for the diversity of the students (e.g., female students and students without prior programming experience). Aside from the experiments themselves and their findings, we intend to also create and publish (and seek independent generic approval of) generic experimental protocols for how to ethically and responsibly conduct such group diversity-performance experiments. This includes how to quantify group diversity and group performance. We imagine these generic experimental protocols would be relevant for other studies and companies seeking to specialize them in order to conduct their own more specific instances of the experiments. This also includes ethical considerations surrounding similar student experiments and how to make them ethically safe(r).
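The project text does not say which diversity measure will be used. As one possible illustration of quantifying group diversity, the sketch below computes Blau's index (one minus the sum of squared category shares), a common measure for categorical attributes. The choice of measure and the example groups are assumptions, not the project's protocol.

```python
from collections import Counter


def blau_index(categories):
    """Blau's index: 0.0 for a fully homogeneous group, higher for more mixed groups."""
    n = len(categories)
    if n == 0:
        raise ValueError("group must not be empty")
    counts = Counter(categories)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Invented example groups, described by prior programming experience.
homogeneous = ["none", "none", "none", "none"]
mixed = ["none", "none", "some", "some"]

print(blau_index(homogeneous))  # fully homogeneous group
print(blau_index(mixed))        # evenly split between two categories
```

For a group split evenly across k categories the index equals 1 - 1/k, so values are only comparable between attributes with the same number of categories.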

D-Pop – A Danish Annual Programming and Problem Solving Event

Summary

The second subproject D-Pop – A Danish Annual Programming and Problem Solving Event will plan, organize, and implement physical D-Pop events at Danish CS departments aimed at young people who are beginning programmers at all levels. The participants get increased programming skills and another perspective on programming and problem solving because focus is on collaboration, creativity, and curiosity.

We expect the events to have a positive effect on recruitment and retention of students as well as for the diversity of students.

Value Creation

The expected results of D-Pop are:

1. Dramatically increased programming skills among participants. This is the expected outcome of just participating, akin to training in any other skill, and includes improved programming language mastery, problem solving skills, resilience, collaboration skills, debugging, and computational problem solving (in particular, algorithmic thinking). This competence boost is independent of the rung of the competence ladder on which the participant starts. It matters because recruiting technically competent IT professionals in Denmark is a well-known problem.

2. Increased exposure and recruitment. D-Pop complements the existing palette of outreach and recruitment activities currently used by Danish CS departments. Compared with similar events, D-Pop content is designed with a focus on immediate, satisfying, and positive feedback to beginning programmers, but in a way that is both honest and values competence, agency, and collaboration. Scalability is built into D-Pop’s infrastructure (both technical and social) from the start.

3. Establishment of a national network of problem setters. The value of this extends beyond D-Pop and immediately includes teaching material for high schools and universities. As another example, the Danish High School Informatics Olympiad (Dansk datalogidyst, DDD, of which Thore is a founding steering committee member) is in many respects the opposite of D-Pop: it is individual and highly competitive, and participation is restricted. However, the requirements for the network of people needed to “make DDD work” are identical to those of D-Pop. We are very far behind in Denmark on this compared to our Nordic neighbours. (Not to speak of other countries, where these activities are multi-million dollar industries.)

Categories
Completed project Explore project

Accountability Privacy Preserving Computation via Blockchain

DIREC project

Summary

This project aims to combine secure multiparty computation and blockchain techniques to enable efficient privacy-preserving data processing, allowing computation on private data while maintaining an audit trail for third-party verification. The project can potentially help combat discrimination and catch unethical and fraudulent behaviour.

Project period: 2022-2024

Project manager

  • Associate Professor Bernardo David
  • Department of Computer Science, ITU
  • beda@itu.dk

The project will investigate how to combine secure multiparty computation and blockchain techniques to obtain more efficient privacy-preserving computation with accountability. Privacy-preserving computation with accountability allows computation on private data (without compromising data privacy), while obtaining an audit trail that allows third parties to verify that the computation succeeded or to identify bad actors who tried to cheat. Applications include data analysis (e.g., in the context of discrimination detection and benchmarking) and fraud detection (e.g., in the financial and insurance industries).

Value Creation

Using this kind of auditable continuous secure computation can help fight discrimination and catch unethical and fraudulent behaviour. Computations that advance these goals include aggregate statistics on salary information to help identify and eliminate wage gaps (e.g. as seen in the case of the Boston wage gap study [4]), statistics on bids in an auction or bets on a gambling site to determine whether those bids or bets are fraudulent, and many others.
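The salary example can be sketched with additive secret sharing, one of the simplest building blocks of secure multiparty computation. The code below is a toy illustration only and is not the project's protocol: it assumes honest-but-curious parties, simulates all parties in one process, and uses a list of SHA-256 commitments as a stand-in for an audit log that could be posted on a blockchain. All function names and salary figures are hypothetical.

```python
import hashlib
import secrets

PRIME = 2**61 - 1  # all arithmetic is done modulo this prime


def share(value, n_parties):
    """Split a value into n random-looking shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def commit(value, nonce):
    """Hash commitment to a value; the nonce hides it until it is opened."""
    return hashlib.sha256(nonce + str(value).encode()).hexdigest()


def run_audited_sum(inputs):
    """Each party shares its input; party j only ever sees the j-th share of each input."""
    n = len(inputs)
    shares = [share(v, n) for v in inputs]
    partial_sums = [sum(shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    nonces = [secrets.token_bytes(16) for _ in range(n)]
    # Commitments are logged before the partial sums are opened.
    log = [commit(p, r) for p, r in zip(partial_sums, nonces)]
    total = sum(partial_sums) % PRIME
    return total, partial_sums, nonces, log


def audit(total, partial_sums, nonces, log):
    """A third party checks the openings against the log and recomputes the total."""
    openings_ok = all(commit(p, r) == c for p, r, c in zip(partial_sums, nonces, log))
    return openings_ok and sum(partial_sums) % PRIME == total


# Invented salaries, for illustration only.
salaries = [52000, 61000, 48000]
total, partial_sums, nonces, log = run_audited_sum(salaries)
print(total == sum(salaries), audit(total, partial_sums, nonces, log))
```

The auditor learns only the aggregate and the partial sums, never an individual salary. This sketch does not show that the inputs themselves were authentic; it only shows that the opened values match what was logged.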

Organizations would not be able to carry out such computations without the use of privacy-preserving technologies due to privacy regulations; so, secure computation is necessary here. To be useful, these secure computations crucially require authenticity and consistency of the inputs. Organizations, which will not necessarily be driven by altruism, will have several incentives to participate in these computations.

First, by using secure computation to detect fraud, the participants can guard against financial loss.

Second, when participants are public organizations, honest participation (which anyone can verify) will generate positive publicity.

Categories
Completed project Explore project

Certifiable Controller Synthesis for Cyber-Physical Systems

DIREC project

Summary

As cyber-physical systems (CPSs) become increasingly widespread, many of them are considered safety-critical. We want to help CPS manufacturers and regulators establish a high level of trust in automatically synthesized control software for safety-critical CPSs. To this end, we propose to extend the technique of formal certification to controller synthesis: controllers are synthesized together with a safety certificate that can be verified by highly trusted theorem provers.

Project period: 2022-2023

Project manager

  • Assistant Professor Martijn Goorden
  • Eindhoven University of Technology

Value Creation

From a distant viewpoint, our project aims to increase confidence in safety-critical CPSs that interact with individuals and society at large. This is the main motivation for applying formal methods to the construction of CPSs. However, our project aims to give a unique spin to this. By cleverly combining the existing methods of controller synthesis, (timed automata) model checking, and interactive theorem proving by means of certificate extraction and checking, we aim to facilitate the construction of control software for CPSs that ticks all the boxes: high efficiency, a very high level of trust in the safety of the system, and the possibility to independently audit the software. Given that CPSs have already conquered every sector of life, with the bulk of the development still ahead of us, we believe such an approach could make an important contribution towards technology that benefits the people.
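The idea of checking a certificate independently of the synthesis tool can be illustrated on a small finite system. In the sketch below, the certificate is an inductive invariant: a set of states that contains the initial states, excludes all unsafe states, and is closed under every move the controller allows. This is a deliberately simplified assumption; the project targets timed automata and theorem provers, and the example system and all names here are hypothetical.

```python
def check_certificate(initial, unsafe, transitions, controller, invariant):
    """Return True if `invariant` proves that `controller` keeps the system safe.

    transitions maps (state, action) to the set of possible successor states.
    controller maps each state to the action it chooses there.
    """
    if not initial <= invariant:
        return False  # the certificate must cover every initial state
    if invariant & unsafe:
        return False  # the certificate must exclude every unsafe state
    for state in invariant:
        action = controller.get(state)
        if action is None or (state, action) not in transitions:
            return False  # the controller must act in every certified state
        if not transitions[(state, action)] <= invariant:
            return False  # some possible successor would leave the certificate
    return True


# Hypothetical system: state 3 is unsafe; "go" from state 1 may reach it.
transitions = {
    (0, "go"): {1},
    (0, "brake"): {0},
    (1, "go"): {2, 3},
    (1, "brake"): {0},
}
initial, unsafe = {0}, {3}

safe_controller = {0: "go", 1: "brake"}
risky_controller = {0: "go", 1: "go"}
certificate = {0, 1}

print(check_certificate(initial, unsafe, transitions, safe_controller, certificate))
print(check_certificate(initial, unsafe, transitions, risky_controller, certificate))
```

The checker is much simpler than a synthesis algorithm, which is the point: only the small checker needs to be trusted, not the tool that produced the controller and its certificate.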

Moreover, our approach aims to ease the interaction between the CPS industry and certification authorities. We believe it is an important duty of regulatory authorities to safeguard their citizens from failures of critical CPSs. Even so, regulation should not grind development to a halt. With our work, we hope to somewhat remedy this apparent conflict of interests. By providing a means to check the safety of synthesized controllers in a well-documented, reproducible, and efficient manner, we believe that the interaction between producers and certifying bodies could be sped up significantly, while increasing reliability at the same time. On top of that, controller synthesis has already been intensely studied and seems to be a rather mature technology from an academic perspective. However, it has barely set foot in industrial applications. We are confident that formal certificate extraction and checking can be an important stepping stone to help controller synthesis make this jump.

This project also contributes to the objective of DIREC to bring new academic partners together in the Danish eco-system. The two principal investigators have their specialization background in two different fields (certification theory and control theory) and have not collaborated before. Thus the project strengthens the collaboration between the two fields as well as the collaboration between the two research groups at AU and AAU. This creates the opportunity for the creation of new scientific results benefiting both research fields.

Finally, we plan to generate tangible value for industry. There are many present-day use cases for control software of critical CPSs. During our project, we want to aid these use cases with controllers that tick all of the aforementioned “boxes”. This can be done by initiating several student projects and theses supporting theory development, tool implementation, and use case demonstration. The Problem Based Learning approach of Aalborg University facilitates this greatly. Furthermore, those students can use their experience in future positions after graduating.

Categories
Completed project Explore project

Methodologies for scheduling and routing droplets in digital microfluidic biochips

DIREC project

Summary

The overall purpose of this project is to define, investigate, and provide preliminary methodologies for scheduling and routing microliter-sized liquid droplets on a planar surface in the context of digital microfluidics.

The main idea is to take a holistic approach to the design of scheduling and routing methods, one that accounts for physical, topological, and behavioural constraints from the real world. This yields solutions that can immediately be used in practical applications.

Project period: 2021-2022

Project manager

  • Associate Professor Luca Pezzarossa
  • Department of Applied Mathematics and Computer Science, DTU
  • lpez@dtu.dk

Value Creation

Digital microfluidic (DMF) biochips have been in the research spotlight for over a decade. However, the technology is still not mature enough to deliver extensive automation for applied biochemistry processes or for research purposes. One of the main reasons is that, although rather simple in construction, DMF biochips lack a clear automated procedure for being programmed and used. The existing methodologies for programming DMF biochips require an advanced understanding of software programming and of the architecture of the biochip itself. These skills are not commonly found among potential target users of this technology, such as biologists and chemists.

A fully automated compilation pipeline able to translate biochemical protocols expressed in a high-level representation into low-level biochip control sequences would open DMF technology to a larger number of researchers and professionals. The lack of advanced scheduling and routing methodologies, which this project investigates, is one of the main obstacles to broadly accessible DMF technology. This is particularly relevant for researchers and small businesses that cannot afford the large pipetting robots commonly used to automate industrial biochemical protocols. One or more DMF biochips can be programmed to execute ad hoc, repetitive, and tedious laboratory tasks, thus freeing qualified working hours for more challenging laboratory tasks.

In addition, the scheduling and routing methodologies targeted by this project enable online decisions, such as controlling the flow of a biochemical protocol based on on-the-fly sensing results from the processes occurring on the biochip. This opens up a large set of possibilities in biochemical research. For instance, the behavior of complex biochemical protocols can be automatically adapted during execution using decisional constructs (if-then-else), allowing for real-time protocol optimization and monitoring.

From a scientific perspective, this project would enable cross-field collaboration, develop new methodologies, and potentially re-purpose techniques that are well known in one research field to solve problems in another. For the proposed project, interesting possibilities include adapting advanced routing and graph-related algorithms or applying well-known online algorithm techniques to manage the real-time flow control nature of the biochemical protocol. The cross-field nature of the project has the potential to provide a better understanding of how advanced scheduling and routing techniques can be applied in the context of a strongly constrained application such as DMF biochips, thus laying the ground for novel solutions, collaborations, and further research.
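As a simple instance of a graph algorithm applied to droplet routing, the sketch below finds a shortest path for one droplet on a rectangular electrode grid using breadth-first search. It is a minimal sketch under strong assumptions: a single droplet, a static set of blocked cells (for example cells occupied by, or too close to, other droplets), and no timing constraints. The grid size and function name are hypothetical, and this is not the project's method.

```python
from collections import deque


def route_droplet(width, height, start, goal, blocked):
    """Return a shortest list of grid cells from start to goal, or None if unreachable."""
    if start in blocked or goal in blocked:
        return None
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        # A droplet moves to a horizontally or vertically adjacent electrode.
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if (0 <= nx < width and 0 <= ny < height
                    and nxt not in blocked and nxt not in parent):
                parent[nxt] = cell
                queue.append(nxt)
    return None


# Hypothetical 5x3 grid where two cells in the middle column must be avoided.
blocked = {(2, 0), (2, 1)}
path = route_droplet(5, 3, start=(0, 0), goal=(4, 0), blocked=blocked)
print(path)
```

Routing several droplets at once is harder, because each droplet's path changes which cells are blocked for the others over time; that interaction is where scheduling and routing have to be solved together.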

Finally, it should be mentioned that the outcome of this project, or of a future larger project based on the proposed explorative research, is characterized by a concrete business value. Currently, some players have entered the market with DMF biochips built to perform a specific biochemical functionality [12,13]. A software stack that includes compilation tools supporting programmability and enabling the same DMF biochip to perform different protocols largely expands the potential market of such technology. This is not the preliminary aim of this research project, but it is indeed a long-term possibility.

Categories
Completed project Explore project

Automated Verification of Sensitivity Properties for Probabilistic Programs

DIREC project

Summary

Sensitivity describes how program outputs change when inputs change. We propose to investigate new methods for specifying and verifying sensitivity properties of probabilistic programs such that they (a) are easy to understand for ordinary programmers, (b) can be verified with automated theorem provers, and (c) cover properties from the machine learning and security literature.

Project period: 2022-2023

Project manager

  • Associate Professor Christoph Matheja
  • Department of Applied Mathematics and Computer Science, DTU
  • chmat@dtu.dk

and

  • Postdoc Alejandro Aguirre
  • Department of Computer Science, AU
  • alejandro@cs.au.dk

Our overall objective is to explore how automated verification of sensitivity properties of probabilistic programs can support developers in increasing the trust in their software through formal assurances.

Probabilistic programs are programs with the ability to sample from probability distributions. Examples include randomized algorithms, where sampling is exploited to ensure that expensive executions have a low probability, cryptographic protocols, where randomness is essential for encoding secrets, and statistics, where programs are becoming a popular alternative to graphical models for describing complex distributions.

The sensitivity of a program determines how its outputs are affected by changes to its input; programs with low sensitivity are robust against fluctuations in their input – a key property for improving trust in software. Minor input changes should, for example, not affect the result of a classifier learned from training data. In the probabilistic setting, the output of a program depends not only on the input but also on the source of randomness. Hence, the notion of sensitivity – as well as techniques for reasoning about it – needs refinement.

Automated verification takes a deductive approach to proving that a program satisfies its specification: users annotate their programs with logical assertions; a verifier then generates verification conditions (VCs) whose validity implies that the program’s specification holds. Deductive verifiers are more complete and more scalable than fully automatic techniques but require significant user interaction. The main challenge for users of automated verifiers lies in finding suitable intermediate assertions, particularly loop invariants, such that an automated theorem prover can discharge the generated VCs. A significant challenge for developers of automated verifiers is to keep the amount and complexity of necessary annotations as low as possible.
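To make the role of loop invariants and verification conditions concrete, the sketch below writes out the three standard VCs for a small summation loop and checks them by enumeration over a bounded range of states. The enumeration is only a stand-in for an automated theorem prover: passing a bounded check is evidence, not a proof. The example program and all names are illustrative assumptions.

```python
from itertools import product

# Program under verification (n is the input):
#     i = 0; s = 0
#     while i < n: s = s + i; i = i + 1
# Precondition: n >= 0.  Postcondition: s == n * (n - 1) // 2.


def pre(n):
    return n >= 0


def post(n, s):
    return s == n * (n - 1) // 2


def good_invariant(n, i, s):
    return 0 <= i <= n and s == i * (i - 1) // 2


def bad_invariant(n, i, s):
    return 0 <= i <= n and s == i * (i + 1) // 2  # wrong by one loop iteration


def verification_conditions(inv):
    """The three VCs a deductive verifier would generate for the loop above."""
    return {
        # The invariant holds when the loop is first reached.
        "initiation": lambda n, i, s: (not pre(n)) or inv(n, 0, 0),
        # One iteration of the loop body preserves the invariant.
        "preservation": lambda n, i, s: (not (inv(n, i, s) and i < n)) or inv(n, i + 1, s + i),
        # On exit, the invariant implies the postcondition.
        "exit": lambda n, i, s: (not (inv(n, i, s) and not i < n)) or post(n, s),
    }


def check_bounded(vcs, bound=8):
    """Return the names of VCs that fail for some state within a small bounded range."""
    failed = set()
    states = product(range(-2, bound), range(-2, bound + 1), range(-2, bound * bound))
    for n, i, s in states:
        for name, vc in vcs.items():
            if not vc(n, i, s):
                failed.add(name)
    return sorted(failed)


print(check_bounded(verification_conditions(good_invariant)))
print(check_bounded(verification_conditions(bad_invariant)))
```

The user's job in this workflow is to supply `good_invariant`; the verifier's job is to generate and discharge the three conditions. A slightly wrong invariant fails immediately, which is why finding suitable invariants is the main burden on users.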

Previous work [1] co-authored by the applicants provides a theoretical framework for reasoning about the sensitivity of probabilistic programs: the above paper presents a calculus to carry out “pen-and-paper” proofs of sensitivity in a principled and syntax-directed manner. The proposed technique deals with sampling instructions by requiring users to identify suitable probabilistic couplings, which act as synchronization points, on top of finding loop invariants. However, the technique is limited in the sense that it does not provide tight sensitivity bounds when changes to the input cause a program to take a different branch on a conditional.
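The role of a coupling can be illustrated numerically. For any coupling of two runs, the expected distance between their outputs is an upper bound on the Kantorovich (Wasserstein-1) distance between the two output distributions, so a well-chosen coupling gives a tight sensitivity bound. The sketch below compares the identity coupling (both runs consume the same random samples) with independent sampling on a toy program. It is a simulation for intuition only, not the calculus from [1], and the program and names are hypothetical.

```python
import random


def damped_noise(x, rng, steps=5):
    """Toy probabilistic program: halve the state and add fresh noise in each step."""
    for _ in range(steps):
        x = 0.5 * x + rng.uniform(-1.0, 1.0)
    return x


def expected_distance(x1, x2, coupled, trials=2000):
    """Average |output1 - output2| when the two runs share, or do not share, randomness."""
    total = 0.0
    for t in range(trials):
        rng1 = random.Random(t)
        # Identity coupling: the second run replays exactly the same samples.
        rng2 = random.Random(t if coupled else t + trials)
        total += abs(damped_noise(x1, rng1) - damped_noise(x2, rng2))
    return total / trials


coupled_bound = expected_distance(2.0, 1.0, coupled=True)
independent_bound = expected_distance(2.0, 1.0, coupled=False)
print(f"identity coupling: {coupled_bound:.5f}, independent runs: {independent_bound:.5f}")
```

Under the identity coupling the noise cancels and the input difference of 1.0 shrinks by a factor of 0.5 per step, giving 0.5^5 = 0.03125. Independent runs give a much larger and therefore much less useful bound.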

Our project has four main goals. First, we will develop methodologies that do not suffer from the limitations of [1]. We believe that conditional branching can be treated by carefully tracking the possible divergence.

Second, we will develop an automated verification tool for proving sensitivity properties of probabilistic programs. The tool will generate VCs based on the calculus from [1], which will be discharged using an SMT solver. In designing the specification language, we aim to achieve a balance so that (a) users can conveniently specify synchronization points for random samples (via so-called probabilistic couplings) and (b) existing solvers can prove the resulting VCs.

Third, we aim to aid the verification process by assisting users in finding synchronization points. Invariant synthesis has been extensively studied in the case of deterministic programs. Similarly, coupling synthesis has been recently studied for the verification of probabilistic programs. We believe these techniques can be adapted to the study of sensitivity.

Finally, we will validate the overall verification system by applying it to case studies from machine learning, statistics, and randomized algorithms.

Categories
Completed project Explore project

Understanding Biases and Diversity of Big Data used for Mobility Analysis

DIREC project

Summary

Our ability to collect, store, and analyze large amounts of data has increased markedly over the past two decades, and today big data plays a critical role in the majority of statistical algorithms. Unfortunately, our understanding of bias in data has not kept pace. Although great progress has been made in developing new models for analyzing data, far less attention has been paid to understanding the fundamental shortcomings of big data.

This project will quantify the biases and uncertainties associated with human mobility data collected through digital means, such as smartphone GPS traces, mobile phone data, and social media data.

Ultimately, we want to ask: is it possible to correct big mobility data through a fundamental understanding of how bias manifests itself?

Project period: 2022-2024

Project manager

  • Associate Professor Vedran Sekara
  • Department of Computer Science, ITU
  • vsek@itu.dk

Value creation

We expect this project to have a long-lasting scientific and societal impact. The scientific impact of this work will allow us to explicitly model bias in algorithmic systems relying on human mobility data and provide insights into which populations are left out. For example, it will allow us to correct for gender, wealth, age, and other types of biases in data globally used for epidemic modeling, urban planning, and many other use cases.

Further, having methods to debias data will allow us to understand what negative impacts results derived from biased data might have. Given the universal nature of bias, we expect our developed debiasing frameworks will also pave the way for quantitative studies of bias in other realms of data science.
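One standard way to correct a known demographic skew is post-stratification: each record is reweighted by the ratio of its group's share in the population to its share in the sample. The sketch below applies this to an invented mobility example. It is a minimal sketch and not the project's debiasing framework; the group labels, population shares, and trip distances are all hypothetical.

```python
from collections import Counter


def poststratification_weights(records, population_shares):
    """Weight per group = population share divided by the group's share in the sample."""
    counts = Counter(group for group, _ in records)
    n = len(records)
    return {group: population_shares[group] / (count / n) for group, count in counts.items()}


def weighted_mean(records, weights):
    """Mean of the values, with each record weighted by its group's weight."""
    total_weight = sum(weights[group] for group, _ in records)
    return sum(weights[group] * value for group, value in records) / total_weight


# Invented sample: 80 records from younger users and 20 from older users,
# each with an average daily travel distance in km.
records = [("young", 10.0)] * 80 + [("old", 4.0)] * 20
population_shares = {"young": 0.5, "old": 0.5}  # assumed known, e.g. from a census

naive = sum(value for _, value in records) / len(records)
weights = poststratification_weights(records, population_shares)
corrected = weighted_mean(records, weights)
print(f"naive mean: {naive:.1f} km, reweighted mean: {corrected:.1f} km")
```

Reweighting only works for groups that appear in the sample and whose population shares are known; it cannot recover people who are missing from the data altogether, which is one reason a deeper understanding of how bias arises is needed.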

The societal impact will be actionable recommendations provided to policy makers regarding: 1) guidelines for how to safely use mobility datasets in data-driven decision processes, 2) tools (including statistical and interactive visualizations) for quantifying the effects of bias in data, and 3) directions for building fairer and more equitable algorithms that rely on mobility data.

It is important to address these issues now, because in their “Proposal for a Regulation on a European approach for Artificial Intelligence” from April 2021 the European Commission (European Union) outlines potential future regulations for addressing the opacity, complexity, bias, and unpredictability of algorithmic systems.

This document states that high-quality data is essential for algorithmic performance and suggests that any dataset should be subject to appropriate data governance and management practices, including examination in view of possible biases. This implies that in the future businesses and governmental agencies will need to have data-audit methods in place. Our project addresses this gap and provides value by developing methodologies to audit mobility data for different types of biases, producing tools that will benefit Danish society and Danish businesses.
