Categories
SciTech project

Privacy and Machine Learning

Project type: SCITECH Project

There is an unmet need for decentralised privacy-preserving machine learning. Cloud computing has great potential, but there is a lack of trust in service providers and a risk of data breaches. A lot of data are private and stored locally for good reasons, but combining the information in a global machine learning (ML) system could lead to services that benefit all. For instance, consider a consortium of banks that want to improve fraud detection by pooling their customers’ payment data and merging these with data from, e.g., Statistics Denmark.

However, for competitive reasons the banks want to keep their customers’ data secret, and Statistics Denmark is not allowed to share the required sensitive data. As another example, consider patient information (e.g., medical images) stored at hospitals. It would be valuable to build diagnostic and prognostic tools using ML based on these data, but the data typically cannot be shared.

The research aim of the project is the development of AI methods and tools that enable industry to develop new solutions for automated image-based quality assessment. End-to-end learning of features and representations for object classification by deep neural networks can lead to significant performance improvements. Several mechanisms have recently been developed to further improve performance and reduce the need for manual annotation work (labelling), including semi-supervised learning strategies and data augmentation.

Semi-supervised learning combines generative models that are trained without labels (unsupervised learning) and the application of pre-trained networks (transfer learning) with supervised learning on small sets of labelled data. Data augmentation employs both knowledge-based transformations, such as translations and rotations, and more general learned transformations, like parameterised “warps”, to increase variability in the training data and improve robustness to natural variation.
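The knowledge-based transformations mentioned above can be illustrated with a minimal sketch. It assumes an image is a plain 2D list of pixel values. The function names (`translate`, `rotate90`, `augment`) and their parameters are hypothetical and not taken from the project, and learned “warps” are not covered.

```python
import random


def translate(image, dx, dy, fill=0):
    """Shift a 2D image (list of rows) by dx columns and dy rows.

    Pixels shifted outside the frame are dropped; uncovered pixels get `fill`.
    """
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def rotate90(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image, n_copies, max_shift=1, seed=0):
    """Return n_copies randomly translated and rotated variants of image.

    max_shift and seed are illustrative parameters (assumptions).
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n_copies):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        out = translate(image, dx, dy)
        # Apply between zero and three quarter turns.
        for _ in range(rng.randint(0, 3)):
            out = rotate90(out)
        variants.append(out)
    return variants
```

Each variant keeps the label of the original image, so a small labelled set can be expanded without additional annotation work.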

Researching secure use of sensitive data will benefit society at large. ML based on computation on encrypted data (CoED) solves the fundamental problem of keeping private input data private while still enabling the use of the most widely applied analytical tools. The CoED privacy-preserving technology reduces the risk of data breaches. It allows for secure use of cloud computing, with no single point of failure, and removes the fundamental cloud security problem of missing trust in service providers.
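The text does not say which cryptographic techniques the project uses. As one illustration of how private inputs can stay private while a joint result is still computed, the sketch below uses additive secret sharing, a common building block in secure multiparty computation. The modulus, the function names and the three-party setup are all assumptions made for this example.

```python
import secrets

# Illustrative modulus (an assumption); all arithmetic is done modulo this value.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine shares; only the full set of shares reveals the value."""
    return sum(shares) % MODULUS


def secure_sum(private_inputs, n_parties=3):
    """Sum private inputs without any single party seeing an individual input."""
    # Each data owner splits its input and sends one share to each party.
    all_shares = [share(v, n_parties) for v in private_inputs]
    # Each party locally adds the shares it received (one column per party).
    partial_sums = [sum(column) % MODULUS for column in zip(*all_shares)]
    # Only the combined partial sums reveal the total.
    return reconstruct(partial_sums)
```

For example, `secure_sum([120, 45, 300])` returns 465. No computing party holds more than one random-looking share of any input, which is the sense in which there is no single point of failure.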

The project will bring together leading experts in CoED and ML. It may serve as a starting point for attracting additional national and international funding, and it will build up competences highly relevant for Danish industry. The concepts developed in the project may change how organisations collaborate and allow for innovative ways of using data, which can increase the competitiveness of Danish companies relative to large international players.

October 1, 2020 – September 31, 2024 – 3.5 years.

Total budget DKK 4.7 million / DIREC investment DKK 3.22 million

Participants

Project Manager

Peter Scholl

Assistant Professor

Aarhus University
Department of Computer Science

E: peter.scholl@cs.au.dk

Ivan Bjerre Damgaard

Professor

Aarhus University
Department of Computer Science

Christian Igel

Professor

University of Copenhagen
Department of Computer Science

Kurt Nielsen

Associate Professor

University of Copenhagen
Department of Food and Resource Economics

Partners

Categories
SciTech project

Machine Learning Algorithms Generalisation

Project type: SCITECH Project

AI is radically changing society, and the main driver behind new AI methods and systems is machine learning. Machine learning focuses on finding solutions for, or patterns in, new data by learning from relevant existing data. Machine learning algorithms are therefore often applied to large datasets, where they more or less autonomously find good solutions by extracting relevant information or patterns hidden in the data. However, it is often not well understood why machine learning algorithms work so well in practice on completely new data: their performance often surpasses what current theory would suggest by a wide margin.

Being able to understand and predict when, why and how well machine learning algorithms work on a given problem is critical for knowing when they may be applied and trusted, particularly in critical systems. Understanding why the algorithms work is also important for driving the machine learning field forward in the right direction, improving upon existing algorithms and designing new ones.

The goal of this project is to research and develop a better understanding of the generalisation capability of the most widely used machine learning algorithms, including boosting algorithms, support vector machines and deep learning algorithms. The result will be new generalisation bounds: positive results showing what can be achieved and negative results showing what cannot.

This will allow us to more fully understand the current possibilities and limits, and thus drive the development of new and better methods. Ultimately, this will provide better guarantees for the quality of the output of machine learning algorithms in a variety of domains.
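To illustrate what a generalisation bound looks like, here is a standard textbook example, not a result of the project. It assumes a finite hypothesis class H, a loss bounded in [0, 1] and m independent training samples; the symbols are chosen for this example only. R(h) denotes the expected loss of hypothesis h on new data, and \hat{R}_m(h) its average loss on the training data.

```latex
% Assumptions (illustrative, not from the project): finite hypothesis class H,
% loss bounded in [0, 1], m i.i.d. training samples, confidence parameter delta.
\[
  \Pr\!\left[\, \forall h \in H:\;
    R(h) \le \hat{R}_m(h) + \sqrt{\frac{\ln |H| + \ln(1/\delta)}{2m}}
  \,\right] \ge 1 - \delta
\]
```

The bound follows from Hoeffding's inequality combined with a union bound over H. It is a positive result in the sense used above: with enough training data, low training loss guarantees low loss on new data.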

Researching the theoretical foundation for machine learning (and thus essentially all AI based systems) will benefit society at large, since a solid theory will allow us to formally argue and understand when and under which conditions machine learning algorithms can deliver the required quality.

As an added value, the project will bring together leading experts in Denmark in the theory of algorithms to (further) develop the fundamental theoretical basis of machine learning. Thus, it may serve as a starting point for additional national and international collaboration and projects, and it will build up competences highly relevant for Danish industry.

October 1, 2020 – September 31, 2024 – 3.5 years.

Total budget DKK 2.41 million / DIREC investment DKK 1.55 million

Participants

Project Manager

Kasper Green Larsen

Associate Professor

Aarhus University
Department of Computer Science

E: larsen@cs.au.dk

Allan Grønlund

Postdoc

Aarhus University
Department of Computer Science

Mikkel Thorup

Professor

University of Copenhagen
Department of Computer Science

Martin Ritzert

Postdoc

Aarhus University
Department of Computer Science

Partners