Kategorier
Explore project

Cyber-Physical Systems with Humans in the Loop

Project type: Explore Project

Cyber-Physical Systems with Humans in the Loop

Summary

Constructing cyber-physical systems with humans in the loop is important in many application areas to enable a close co-operation between humans and machines. However, there are also many challenges to overcome when constructing such systems with current software technologies and human-centered design approaches. To foster collaboration on the topic the project will study state-of-the-art and map out challenges which is important for Danish industry to address in future work.

Value Creation

Scientific value: The project will provide a better terminology and a common understanding of state-of-theart across several areas of research within DIREC and disseminate this knowledge to the scientific community.

Capacity building: The project will establish new collaboration setups within DIREC and involve master students in the activities.

Business value: The project will in workshops disseminate knowledge to Danish industry and identify cases that could be relevant areas of collaboration for DIREC with Danish Industry in future larger projects. The project will among others connect to the community involved in the Nordic IoT Center.

Participants

Project Manager

Mikkel Baun Kjærgaard

Professor

University of Southern Denmark
The Maersk Mc-Kinney Moller Institute

E: mbkj@mmmi.sdu.dk

Jan Madsen

Professor

Technical University of Denmark
DTU Compute

Peter Gorm Larsen

Professor

Aarhus University
Department of Electrical and Computer Engineering

Torkild Clemmensen

Professor

Copenhagen Business School
Department of Digitalization

Kim Guldstrand Larsen

Professor

Aalborg University
Department of Computer Science

Kategorier
Explore project

Re-Use of Robotic-data in Production through search, simulation and learning

Project type: Explore Project

Re-Use of Robotic-data in Production through search, simulation and learning

Project Description

We investigate – in close co-operation with four Danish companies, the cluster organization Odense Robotics and the research platform MADE – underlying problems of data re-use on assembly in production.

In contrast to other fields in AI, the potential of exploiting large data collections is not realized in robotics yet. We aim at analyzing the underlying scientific and technical challenges as well as associated legal and privacy issues by means of three half days meetings of university partners and companies, one public workshop, and the preparation of four deliverables.

The potential of re-use of data of past robot executions in industrial robotics is not realized yet due to scientific, technical and IPR issues. In this project, we conduct – in close co- operation with key industrial partners – a first investigation into the underlying problems to prepare a larger project application which is grounded in today’s challenges faced by companies.

The investigation is done by means of three meetings of university partners and companies, one public workshop – which is planned to be open to DIREC, Odense Robotics and MADE (Manufacturing Academy of Denmark, www.made.dk) participants, and the preparation of three deliverables which address the underlying scientific, technical, privacy and legal issues. The ReRoPro project operates in the crossfield of classical AI and industrial robotics and also aims at bridging between DIREC and MADE by exploiting synergies between these two large national consortia in the area of AI and Robotics. A fourth deliverable will be a grant proposal with significant company co-financing. ReRoPro gets support by Odense Robotics and MADE.

In contrast to, e.g., computer vision or speech recognition, there are only very few examples where data has been exploited across different robotics applications. In the area of computer vision – in particular, in the field of deep learning in the last decade – efficient software-modules for, e.g., object recognition and human pose estimation have been developed. The same holds for speech recognition and other areas. The availability and exploitation of large amounts of data has been crucial for these successes. However, similar successes have not yet emerged in robotics. The main reason for that are the particularities of robot data and the close connection to hardware infrastructure, which make it more difficult to transfer experience from one setup to another. In addition, IPR issues are a potential hurdle since the data about objects and processes is of great value for the companies. Therefore, companies are hesitant to share such data with potential competitors.

As a consequence of the lack of availability of such structured data, today’s robot applications have to be built up from scratch for every new task. That leads to long and expensive set-up times, and thus limits the use of robot solutions. This, in turn, pushes production to countries with lower salaries than Denmark. If, however, data of past executions could be made available in a reusable format, it could be harnessed in the future for the robot control methods, which could reduce the setup time of the application and subsequently makes it cheaper and broader.

In this project, we perform initial steps to address the complex issues involved in the problems outlined above by (1) performing first steps in defining appropriate data structures for data storage and retrieval as well as a high-level architecture that allows for exploitation of robot data, (2) formulating a number of learning problems with our three industrial partners routed in the need of industry that will constitute the use cases addressed in future projects, and (3) investigate the IPR-issues connected to the sharing of sensitive data. By means of three half-day meetings of AI experts, robotic researchers and companies and a public workshop the problems at hand will be explored and the results will be put down in three deliverables addressing the three issues mentioned above. A larger project proposal will be a fourth deliverable.

Value Creation

Scientific value creation 

Scientific insight into the problem of re-use of data in robotics: The project will gather experts in the field and establish knowledge on the problem at hand in a strong consortium that plans to work on the problem in the coming years. The main challenges faced will be analyzed and spelled out and first steps towards valid data structures will be made already during the project period and summarized in one public deliverable. Business value creation
The formulation of a set of learning problems grounded in the needs of industry: There have been past attempts to address the problem of re-use of robot data in the context of the general problem of cognition (e.g., during the Cognitive System Calls a decade ago). An important difference of ReRoPro and these past projects will be that we will root the problem in tasks that are relevant for industry already now. These problems also have much lower complexity than the general cognition problem which increases the success chances. In one deliverable, we will define a small set of learning problems which, when successfully addressed by our generic approach, will generate large value for the companies involved. The learning problems will be also specified to a degree that we can estimate that our approach will succeed with high likelihood. Both is required for gathering larger funding for our approach in the future.

The awareness of legal challenges connected to the re-use of data: A prerequisite for companies to become involved will be that their production data will be protected. There are different means to address these issues (keeping data local, abstracting data into models which are shared, etc.). The awareness of these issues and possible solutions thereof are crucial for assuring the required trust for the companies involved and is therefore
a basis for the business value to be generated.

Capacity building

Shaping of a larger proposal on re-use of data in assembly with strong company involvement: The explorer project ReRoPro is supposed to be the basis for forming a consortium for a rather large national project proposal with significant company involvement which is supposed to be handed in already in 2022.
National and international visibility of the DIREC consortium by means of a one-day workshop on the re-use of robot data: As part of the project, we will organize a DIREC-workshop with top-notch national and international experts on the re-use of robotic data. This will give the DIREC project national and international visibility. Exploiting synergies between the MADE and DIREC consortium: ReRoPro is positioned in the interface between AI and industrial robotics, The MADE-consortium (Manufacturing Academy of Denmark), the leading research platform in Denmark in industrial robotics, will support ReRoPro. By that, the two large national consortia DIREC and MADE, both funded by InnovationFonden Danmark, can exploit synergies
between them.

Capacity Building for Education 

We will integrate the insights gained during the ReRoPro project in our robotic educations, both on the master and PhD level. Since we have already a number of master and PhD theses running that touch the topic addressed in ReRoPro, the project will facilitate the systemization of the data handling in these and future theses. Moreover, since data management and data re-use is becoming increasingly important for robot applications in industry, we will integrate this aspect more systematically in our robotic courses. ReRoPro will also inspire future master theses at SDU Robotics and Software and KU/AAU computer science.

Participants

Project Manager

Norbert Krüger

Professor

University of Southern Denmark
The Maersk Mc-Kinney Moller Institute

E: norbert@mmmi.sdu.dk

Aljaz Kramberger

Assistant Professor

University of Southern Denmark
The Maersk Mc-Kinney Moller Institute

Jakob Wilm

Associate Professor

University of Southern Denmark
The Maersk Mc-Kinney Moller Institute

Mikkel Baun Kjærgaard

Professor

University of Southern Denmark
The Maersk Mc-Kinney Moller Institute

Anders Lyhne Kristensen

Professor

University of Southern Denmark
The Maersk Mc-Kinney Moller Institute

Kenny Erleben

Associate Professor

University of Copenhagen
Department of Computer Science

Sune Darkner

Associate Professor

University of Copenhagen
Department of Computer Science

Thomas Dyhre Nielsen

Professor

Aalborg University
Department of Computer Science

Alvaro Torralba

Associate Professor

Aalborg University
Department of Computer Science

Kategorier
Explore project

DeCoRe: Tools and Methods for the Design and Coordination of Reactive Hybrid Systems

Project type: Explore Project

DeCoRe: Tools and Methods for the Design and Coordination of Reactive Hybrid Systems

Summary

A recurring problem of digitalised industries is to design and coordinate hybrid systems that include IoT (Internet of Things), edge, and cloud solutions. Currently adopted methods and tools are not effective to this end, because they rely too much on informal specifications that are manually written and interpreted by humans.

We aim at exploring the applicability of forefront technologies and methods developed at SDU, KU, and AAU for the design of reactive hybrid IoT-edge-cloud architectures in Danish industry. These technologies are based on unambiguous formal languages, which can be processed by computers to check for desirable design properties (such as compatibility of software interfaces) and to deploy components for monitoring the correct functioning of systems. Adopting these techniques has shown to substantially increase the productivity of digital industries (for example, up to 4x increase in development speed).

We will:

(a) carry out a concrete use case with a partner company (Sanovo Technology Group)

(b) initiate knowledge sharing on this topic among AAU, KU, and SDU through workshops

(c) communicate our findings to the rest of the DIREC community.

Value Creation

Scientific value: The scientific value of the project is twofold:

(a) concrete knowledge on the advantages and potential challenges brought by the application of cutting-edge techniques like Jolie for the development of hybrid systems (IoT-edge-cloud) in the Danish industry (using Sanovo Technology Group for the case study); and

(b) knowledge on the synergies and future directions for the integration of forefront scientific methods for hybrid systems developed by Danish universities (Jolie, UPPAAL, DCR Graphs). Providing a perspective that comes from concrete industrial experience, with substaintiated needs, has significant potential to influence the future development of both research and industrial development in Denmark.

Capacity building: Companies will thus benefit from an increased number of students that they can hire to satisfy their needs with respect to hybrid systems. Universities benefit by gaining sustainable candidates for PhD positions in future projects connected to this exploration.

Business and societal value: Due to the growth potential in solutions for automation and data intensive processing solutions, this project will strengthen Danish competitiveness through a reduced cost of developing deploying and running IoT and cloud software. Potentially, this could lead to increased export of IT products and services.

Participants

Project Manager

Fabrizio Montesi

Professor

University of Southern Denmark
Department of Mathematics and Computer Science

E: fmontesi@imada.sdu.dk

Thomas Troels Hildebrandt

Professor

University of Copenhagen
Department of Computer Science

Kim Guldstrand Larsen

Professor

Aalborg University
Department of Computer Science

Kategorier
Educational project

Initiatives to improve recruitment and retention of IT students

Project type: Explore Project

Initiatives to improve recruitment and retention of IT students

Denmark needs more IT specialists. But how do we get more young people to study computer science and become IT specialist? This project, consisting of two subprojects, focuses on initiatives that can improve both recruitment and retention of a larger but also more diverse group of young people e.g., female students and students without prior programming experience.

Diversity or Not: Heterogeneous vs Homogeneous study groups
Summary

The first subproject Diversity or Not: Heterogeneous vs Homogeneous Student Groups? will study the effect of diversity on the formation of CS student groups. The intent is to uncover evidence to issue recommendations on how to best form project groups. We expect this knowledge to be beneficial for the recruitment and retention of students as well as for the diversity of students.

Value Creation

We expect the outcomes of this project will create significant value for primarily the Danish universities, but also for the Danish tech industry (technology companies). The project intends to derive research-based recommendations on how to best form (student) project groups. Since group work is so widespread in Computer Science education in all of Denmark to foster communication and collaboration skills in connection with a problem, it is important to figure out what works best. This will strengthen the CS education in Denmark.

Studying the impact of diversity on project groups will also be important as a proxy for professional groups in a work context, beyond university (with the obvious external threats to validity of this generalization). We expect this knowledge to be beneficial for the recruitment and retention of students as well as for the diversity of the students (e.g., female students and students without prior programming experience). Aside from the experiments themselves and their findings, we intend to also create and publsh (and seek independent generic approval of) generic experimental protocols for how to ethically and responsibly conduct such group diversity-performance experiments. This includes how to quantify group diversity and group performance. We imagine these generic experimental protocols would be relevant for other studies and companies seeking to specialize them in order to conduct their own more specific instances of the experiments. This also includes ethical considerations surrounding similar student experiments and how to make them ethically safe(r)

D-Pop – A Danish Annual Programming and Problemsolving Event

Summary

The second subproject D-Pop – A Danish Annual Programming and Problem Solving Event will plan, organize, and implement physical D-Pop events at Danish CS departments aimed at young people who are beginning programmers at all levels. The participants get increased programming skills and another perspective on programming and problem solving because focus is on collaboration, creativity, and curiosity.

We expect the events to have a positive effect on recruitment and retention of students as well as for the diversity of students.

Value Creation

The expected results of D-Pop are:

1. Dramatically increased programming skills among participants. This is the expected outcome of just participating, akin to training in any other skill, and includes improved programming language mastery, problem solving skills, resilience, collaboration skills, debugging, and computational problem solving (in particular, algorithmic thinking). This competence boost is independent of the rung of the competence ladder on which the participant starts. I don’t need to reiterate the problems with recruitment of technically competent IT professionals in Denmark.

2. Increased exposure and recruitment. D-Pop complements the existing pallette of outreach and recruitment activities currently used by Danish CS departments. Compared with similar events, D-Pop content is designed with a focus on immediate, satisfying, and positive feedback to beginning programmers, but in a way that is both honest and values competence, agency, and collaboration. Scalability is built into D-Pop’s infrastructure (both technical and social from the start.

3. Establishment of a national network of problem setters. The value of this extends beyond D-Pop and immediately includes teaching material for high schools and universities. For another example, the Danish High School Informatics Olympiad (Dansk datalogidyst, of which Thore is a founding steering committee member) is in many aspects an opposite of D-Pop: it is individual, highly competitive, participation is restricted. However, the requirements to the network of people needed to “make DDD work” is identical to that of D-Pop. We are very far behind in Denmark on this compared to our Nordic neighbours. (Not to speak of other countries, where these activities are multi-million dollar industries.)

Participants

Project Manager - project 1

Claus Brabrand

Associate Professor

IT University of Copenhagen
Department of Computer Science

E: brabrand@itu.dk

Project Manager - project 2

Thore Husfeldt

Professor

IT University of Copenhagen
Department of Computer Science

E: thore@itu.dk

Louise Barkhuus

Professor

IT University of Copenhagen
Department of Computer Science

Kim Normann Andersen

Professor

Copenhagen Business School
Department of Digitalization

Jacob Nørbjerg

Associate Professor

Copenhagen Business School
Department of Digitalization

Samuel Alberg Thrysøe

Associate Professor

Aarhus University
Department of Computer Science

Kategorier
Explore project

Algorithms education via animation videos

Project type: Explore Project

Algorithms education via animation videos

Summary

Lectures on algorithms traditionally consist of blackboard/slide talks and reading material. This mode however need not be optimal for all students: Several highly popular YouTube channels for mathematics and other scientific content (e.g., 3blue1brown, Numberphile, Veritasium) with millions of views indicate that learners may respond very positively to professionally produced educational videos. This project aims at creating and evaluating an initial library of such videos to supplement teaching in algorithms.

Value Creation

This project primarily creates value in computer science education. It allows students to approach abstract concepts within CS through online videos, a familiar medium for digital natives. This medium gives educators and learners an alternative way of approaching material that is inherently abstract and generally considered hard to grasp. In alignment with the goals of workstream 12, this project also supports scaling the teaching of algorithms topics through digital technology. With a focus on clear and engaging communication and visual design, we also expect this alternative teaching mode to inspire students that are otherwise deterred from the classical technical image of CS. This may have beneficial effects on student diversity. With the necessary infrastructure set up, the project serves as a basis for further research projects and BSc/MSc theses on algorithm visualization and education. This attracts more students to algorithms topics, and possibly also to research in algorithms as PhD students. Examples for previous theses supervised by the PI are shown at the bottom of this page. With their educational value and production quality, coupled with our dissemination efforts, the videos indirectly also serve as outreach and publicity for CS education in Denmark, the DIREC project, and computer science in general. This may attract the general public and prospective students to computer science topics. Here, we also rely on the outreach expertise of project partner Thore Husfeldt. Formal publications in computer science education conferences/journals may also follow from this project

Participants

Project Manager

Radu-Christian Curticapean

Assistant Professor

IT University of Copenhagen
Department of Computer Science

E: racu@itu.dk

Thore Husfeldt

Professor

IT University of Copenhagen
Department of Computer Science

Nutan Limaye

Associate Professor

IT University of Copenhagen
Department of Computer Science

Christian Wulff-Nilsen

Associate Professor

University of Copenhagen
Department of Computer Science

Mikkel Abrahamsen

Assistant Professor

University of Copenhagen
Department of Computer Science

Philip Bille

Professor

Technical University of Denmark
DTU Compute

Inge Li Gørtz

Professor

Technical University of Denmark
DTU Compute

Eva Rotenberg

Associate Professor

Technical University of Denmark
DTU Compute

Srikanth Srinivasan

Associate Professor

Aarhus University
Department of Computer Science

Kategorier
Explore project

Accountability Privacy Preserving Computation via Blockchain

Project type: Explore Project

Accountability Privacy Preserving Computation via Blockchain

Summary

We will investigate how to combine secure multiparty computation and blockchain techniques to obtain more efficient privacy-preserving computation with accountability. Privacy-preserving computation with accountability allows computation on private data (without compromising data privacy), while obtaining an audit trail that allows third parties to verify that the computation succeeded or to identify bad actors who tried to cheat. Applications include data analysis (e.g., in the context of discrimination detection and bench marking) and fraud detection (e.g. in the financial and insurance industries).

Value Creation

Using this kind of auditable continuous secure computation can help fight discrimination and catch unethical and fraudulent behaviour. Computations that advance these goals include aggregate statistics on salary information  to help identify and eliminate wage gaps (e.g. as seen in the case of the Boston wage gap study [4]), statistics on bids in an auction or bets on a gambling site to determine whether those bids or bets are fraudulent, and many others. Organizations would not be able to carry out such computations without the use of privacy-preserving technologies due to privacy regulations; so, secure computation is necessary here. To be useful, these secure computations crucially require authenticity and consistency of the inputs. Organizations, which will not necessarily be driven by altruism, will have several incentives to participate in these computations. First, by using secure computation to detect fraud, the participants can guard against financial loss. Second, when participants are public organizations, honest participation (which anyone can verify) will generate positive publicity.

Participants

Sophia Yakoubov

Assistant Professor

Aarhus University
Department of Computer Science

E: sophia.yakoubov@cs.au.dk

Tore Frederiksen

Senior Cryptography Engineer

The Alexandra Institute

E: tore.frederiksen@alexandra.dk

Bernardo David

Associate Professor

IT University of Copenhagen
Department of Computer Science

E: beda@itu.dk

Mads Schaarup Andersen

Senior Usable Security Expert

The Alexandra Institute

Laura Lynggaard Nielsen

Senior Anthropologist

The Alexandra Institute

Louise Barkhuus

Professor

IT University of Copenhagen
Department of Computer Science

Kategorier
Explore project

Certifiable Controller Synthesis for Cyber-Physical Systems

Project type: Explore Project

Certifiable Controller Synthesis for Cyber-Physical Systems

Summary

As cyber-physical systems (CPSs) are becoming ever more ubiquitous, many of them are considered safetycritical. We want to help CPS manufacturers and regulators with establishing high levels of trust in automatically synthesized control software for safety-critical CPSs. To this end, we propose to extend the technique of formal certification towards controller synthesis: controllers are synthesized together with a safety certificate that can be verified by highly trusted theorem provers.

Value Creation

From a distant view point, our project aims to increase confidence in safety-critical CPSs that interact with individuals and the society at large. This is the main motivation for applying formal methods to the construction of CPSs. However, our project aims to give a unique spin to this. By cleverly combining the existing methods of controller synthesis, (timed automata) mode checking, and interactive theorem proving via means of certificate extraction and checking, we aim to facilitate the construction of control software for CPSs that ticks all the boxes: high efficiency, a very high level of trust in the safety of the system, and the possibility to independently audit the software. Given that CPSs have already conquered every sector of life, with the bulk of the development still ahead of us, we believe such an approach could make an important contribution towards technology that benefits the people.

Moreover, our approach aims to ease the interaction between the CPS industry and certification authorities. We believe it is an important duty of regulatory authorities to safeguard their citizens from failures of critical CPSs. Even so, regulation should not grind development to a halt. With our work, we hope to somewhat remedy this apparent conflict of interests. By providing a means to check the safety of synthesized controllers in a well-documented, reproducible, and efficient manner, we believe that the interaction between producers and certifying bodies could be sped up significantly, while increasing reliability at the same time. On top of that, controller synthesis has already been intensely studied and seems to be a rather mature technology from an academic perspective. However, it has barely set a foot into industrial applications. We are confident that formal certificate extraction and checking can be an important stepping stone to help controller synthesis make this jump.

This project also contributes to the objective of DIREC to bring new academic partners together in the Danish eco-system. The two principal investigators have their specialization background in two different fields (certification theory and control theory) and have not collaborated before. Thus the project strengthens the collaboration between the two fields as well as the collaboration between the two research groups at AU and AAU. This creates the opportunity for the creation of new scientific results benefiting both research fields.

Finally, we plan to generate tangible value for industry. There are many present-day use cases for control software of critical CPSs. During our project, we want to aid these use cases with controllers that tick all of the aforementioned “boxes”. This can be done by initiating several student projects and theses supporting theory development, tool implementation, and use case demonstration. The Problem Based Learning approach of Aalborg University facilitates this greatly. Furthermore, those students can use their experience
in future positions after graduating.

Participants

Martijn Goorden

Postdoc

Aalborg University
Department of Computer Science

E: mgoorden@cs.aau.dk

Simon Wimmer

Postdoc

Aarhus University
Department of Computer Science

E: swimmer@cs.au.dk

Kategorier
Explore project

Methodologies for scheduling and routing droplets in digital microfluidic biochips

Project type: Explore Project

Methodologies for scheduling and routing droplets in digital microfluidic biochips

Summary

The overall purpose of this project is to define, investigate, and provide preliminary methodologies for scheduling and routing microliter-sized liquid droplets on a planar surface in the context of digital microfluidics.

The main idea is to use a holistic approach in the design of scheduling and routing methodologies that takes into account real-world physical, topological, and behavioral constraints. Thus, producing solutions that can immediately find use in practical applications.

Value Creation

DMF biochips have been in the research spotlight for over a decade. However, the technology is still not mature at a level where it can deliver extensive automation to be used in applied biochemistry processes or for research purposes. One of the main reasons is that, although rather simple in construction, DMF biochips lack a clear automated procedure for being programmed and used. The existing methodologies for programming DMF biochips require an advanced level of understanding of software programming and of the architecture of the biochip itself. These skills are not commonly found in potential target users of this technology, such as biologists and chemists.

A fully automated compilation pipeline able to translate biochemical protocols expressed in a high-level representation into the low-level biochip control sequences would enable access to the DMF technology by a larger number of researchers and professionals. The advanced scheduling and routing methodologies investigated by this project are one of the main obstacles towards broadly accessible DMF technology. This is particularly relevant for researchers and small businesses which cannot afford the large pipetting robots commonly used to automate biochemical industrial protocol. One or more DMF biochips can be programmed to execute ad-hoc repetitive and tedious laboratory tasks. Thus, freeing qualified working hours for more challenging laboratory tasks.

In addition, the scheduling and routing methodologies targeted by this project enable for online decisions, such as controlling the flow of the biochemical protocols depending upon on-the-fly sensing results from the processes occurring on the biochip. This opens for a large set of possibilities in the biochemical research field. For instance, the behavior of complex biochemical protocols can be automatically adapted during execution using decisional constructs (if-then-else) allowing for real-time protocol optimizations and monitoring.

From a scientific perspective, this project would enable cross-field collaboration, develop new methodologies, and potentially re-purpose those techniques that are well known in one research field to solve problems of another field. For the proposed project, interesting possibilities include adapting advanced routing and
graph-related algorithms or applying well-known online algorithms techniques to manage the real-time flow control nature of the biochemical protocol. The cross-field nature of the project has the potential of providing a better understanding of how advanced scheduling and routing techniques can be applied in the context of a strongly constrained application such as DMF biochips. Thus, laying the ground for novel solutions, collaborations, and further research.

Finally, it should be mentioned that the outcome of this project, or of a future larger project based on the proposed explorative research, is characterized by a concrete business value. Currently, some players have entered the market with DMF biochips built to perform a specific biochemical functionality [12,13]. A software stack that includes compilation tools supporting programmability and enabling the same DMF biochip to perform different protocols largely expands the potential market of such technology. This is not the preliminary aim of this research project, but it is indeed a long-term possibility.

Participants

Project Manager

Luca Pezzarossa

Assistant Professor

Technical University of Denmark
DTU Compute

E: lpez@dtu.dk

Eva Rotenberg

Associate Professor

Technical University of Denmark
DTU Compute

Lene M. Favrholdt

Associate Professor

University of Southern Denmark
Department of Mathematics and Computer Science

Kategorier
Explore project

Automated Verification of Sensitivity Properties for Probabilistic Programs

Project type: Explore Project

Automated Verification of Sensitivity Properties for Probabilistic Programs

Sensitivity measures how much program outputs vary when changing inputs. We propose exploring novel methodologies for specifying and verifying sensitivity properties of probabilistic programs such that they (a) are comprehensible to everyday programmers, (b) can be verified using automated theorem provers, and (c) cover properties from the machine learning and security literature.

This work will bring together two junior researchers who recently arrived in Denmark and obtained their PhDs working on probabilistic verification.

Project description

Our overall objective is to explore how automated verification of sensitivity properties of probabilistic programs can support developers in increasing the trust in their software through formal assurances.

Probabilistic programs are programs with the ability to sample from probability distributions. Examples include randomized algorithms, where sampling is exploited to ensure that expensive executions have a low probability, cryptographic protocols, where randomness is essential for encoding secrets, and statistics, where programs are becoming a popular alternative to graphical models for describing complex distributions.

The sensitivity of a program determines how its outputs are affected by changes to its input; programs with low sensitivity are robust against fluctuations in their input – a key property for improving trust in software. Minor input changes should, for example, not affect the result of a classifier learned from training data. In the probabilistic setting, the output of a program depends not only on the input but also on the source of randomness. Hence, the notion of sensitivity – as well as techniques for reasoning about it – needs refinement.

Automated verification takes a deductive approach to proving that a program satisfies its specification: users annotate their programs with logical assertions; a verifier then generates verification conditions (VCs) whose validity implies that the program’s specification holds. Deductive verifiers are more complete and more scalable than fully automatic techniques but require significant user interaction. The main challenge for users of automated verifiers lies in finding suitable intermediate assertions, particularly loop invariants, such that an automated theorem prover can discharge the generated VCs. A significant challenge for developers of automated verifiers is to keep the amount and complexity of necessary annotations as low as possible.

Previous work [1] co-authored by the applicants provides a theoretical framework for reasoning about the sensitivity of probabilistic programs: the above paper presents a calculus to carry out “pen-and-paper” proofs of sensitivity in a principled and syntax-directed manner. The proposed technique deals with sampling instructions by requiring users to identify suitable probabilistic couplings, which act as synchronization points, on top of finding loop invariants. However, the technique is limited in the sense that it does not provide tight sensitivity bounds when changes to the input cause a program to take a different branch on a conditional.

Our project has four main goals. First, we will develop methodologies that do not suffer from the limitations of [1]. We believe that conditional branching can be treated by carefully tracking the possible divergence. Second, we will develop an automated verification tool for proving sensitivity properties of probabilistic programs. The tool will generate VCs based on the calculus from [1], which will be discharged using an SMT solver. In designing the specification language, we aim to achieve a balance so that (a) users can conveniently specify synchronization points for random samples (via so-called probabilistic couplings) and (b) existing solvers can prove the resulting VCs. Third, we aim to aid the verification process by assisting users in finding synchronization points. Invariant synthesis has been extensively studied in the case of deterministic programs. Similarly, coupling synthesis has been recently studied for the verification of probabilistic programs [2]. We believe these techniques can be adapted to the study of sensitivity. Finally, we will validate the overall verification system by applying it to case studies from machine learning, statistics, and randomized algorithms.

 

Participants

Alejandro Aguirre

Postdoc

Aarhus University
Department of Computer Science

Christoph Matheja

Assistant Professor

Technical University of Denmark
DTU Compute

Kategorier
Explore project

Understanding Biases and Diversity of Big Data used for Mobility Analysis

Project type: Explore Project

Understanding Biases and Diversity of Big Data used for Mobility Analysis

Summary

Our capabilities to collect, store and analyze vast amounts of data have greatly increased in the last two decades, and today big data plays a critical role in a large majority of statistical algorithms. Unfortunately, our understanding of biases in data has not kept up. While there has been lot of progress in developing new models to analyze data, there has been much less focus on understanding the fundamental shortcomings of big data.

This project will quantify the biases and uncertainties associated with human mobility data collected through digital means, such a smartphone GPS traces, cell phone data, and social media data.

Ultimately, we want to ask the question: is it possible to fix big mobility data through a fundamental understanding of how biases manifest themselves?

Value Creation

We expect this project to have a long-lasting scientific and societal impact. The scientific impact of this work will allow us to explicitly model bias in algorithmic systems relying on human mobility data and provide insights into which population are left out. For example, it will allow us to correct for gender, wealth, age, and other types of biases in data globally used for epidemic modeling, urban planning, and many other usecases. Further, having methods to debias data will allow us to understand what negative impacts results derived from biased data might have. Given the universal nature of bias, we expect our developed debiasing frameworks will also pave the way for quantitative studies of bias in other realms of data science.

The societal impact will be actionable recommendations provided to policy makers regarding: 1) guidelines for how to safely use mobility datasets in data-driven decision processes, 2) tools (including statistical and interactive visualizations) for quantifying the effects of bias in data, and 3) directions for building fairer and equitable algorithm that rely on mobility data.

It is important to address these issues now, because in their “Proposal for a Regulation on a European approach for Artificial Intelligence” from April 2021 the European Commission (European Union) outlines potential future regulations for addressing the opacity, complexity, bias, and unpredictability of algorithmic systems. This document states that high-quality data is essential for algorithmic performance and suggest that any dataset should be subject to appropriate data governance and management practices, including examination in view of possible biases. This implies that in the future businesses and governmental agencies will need to have data-audit methods in place. Our project addresses this gap and provides value by developing methodologies to audit mobility data for different types of biases — producing tools which Danish society and Danish businesses will benefit from.

Participants

Project Manager

Vedran Sekara

Assistant Professor

IT University of Copenhagen
Department of Computer Science

E: vsek@itu.dk

Laura Alessandretti

Associate Professor

Technical University of Denmark
DTU Compute

Manuel Garcia-Herranz

Chief Scientist

UNICEF
New York

Elisa Ormodei

Assistant Professor

Central European University