Project type: Bridge Project

Embedded AI

AI is currently limited by the need for massive data centres and centralized architectures, as well as the need to move data to the algorithms. To overcome this key limitation, AI will evolve from today’s highly structured, controlled, and centralized architecture to a more flexible, adaptive, and distributed network of devices. This transformation will bring the algorithms to the data, made possible by algorithmic agility and autonomous data discovery. It will drastically reduce the need for the high-bandwidth connectivity required to transport massive data sets, and it will avoid the compromises in data security and privacy that such transfers can entail. Furthermore, it will eventually allow true real-time learning at the edge.

This transformation is enabled by the merging of AI and IoT into “Artificial Intelligence of Things” (AIoT), and it has created an emerging sector of Embedded AI (eAI), where all or part of the AI processing is done on the sensor devices at the edge rather than sent to the cloud. The major drivers for Embedded AI are increased responsiveness and functionality, reduced data transfer, and increased resilience, security, and privacy. To deliver these benefits, development engineers need to acquire new skills in embedded development and systems design.

To enter and compete in the AI era, companies are hiring data scientists to build expertise in AI and create value from data. This is true for many companies developing embedded systems, for instance to control water, heat, and air flow in large facilities, large ship engines, or industrial robots, all with the aim of optimizing their products and services.
However, there is a challenging gap between programming AI in the cloud using tools like TensorFlow and programming at the edge, where resources are extremely constrained. This project will develop methods and tools to migrate AI algorithms from the cloud to a distributed network of AI-enabled edge-devices. The methods will be demonstrated on several use cases from the industrial partners.

In a traditional, centralized AI architecture, all the technology blocks are combined in the cloud or in a single cluster (edge computing) to enable AI. Data collected by IoT, i.e., by individual edge-devices, is sent towards the cloud. To limit the amount of data that needs to be sent, data aggregation may be performed along the way. The AI stack, the training, and the later inference all run in the cloud, and the resulting actions are transferred back to the relevant edge-devices. While the cloud provides complex AI algorithms that can analyse huge datasets quickly and efficiently, it cannot deliver a true real-time response, and data security and privacy may be compromised.
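
As a rough illustration of the data aggregation step mentioned above, the following minimal sketch reduces a stream of raw sensor readings to one small summary record per window before anything is sent onwards. The function names, the window size, and the choice of summary statistics are assumptions made for this example and are not taken from the project.

```python
from statistics import mean


def aggregate_window(samples):
    """Reduce one window of raw sensor readings to a small summary record."""
    if not samples:
        raise ValueError("window must contain at least one sample")
    return {
        "count": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
    }


def aggregate_stream(readings, window_size):
    """Split readings into consecutive windows and summarise each one."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [
        aggregate_window(readings[i:i + window_size])
        for i in range(0, len(readings), window_size)
    ]


# Hypothetical temperature readings collected by one edge-device.
readings = [20.0, 20.5, 21.0, 21.5, 30.0, 30.5, 31.0, 31.5]
summaries = aggregate_stream(readings, window_size=4)
```

Here eight raw readings become two summary records. The trade-off is that the cloud no longer sees the individual samples, which is one reason to move inference itself closer to the data.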

When it comes to Embedded AI, where AI algorithms are moved to the edge, there is a need to transform the foundation of the AI stack. Algorithmic agility and distributed processing will enable AI to perceive and learn in real time by mirroring critical AI functions across multiple disparate systems, platforms, sensors, and devices operating at the edge. We propose to address these challenges in the following steps, starting with single edge-devices.

  1. Tiny inference engines – Algorithmic agility of the inference engines will require new AI algorithms, new processing architectures, and new connectivity. We will explore suitable microcontroller architectures and reconfigurable platform technologies, such as Microchip’s low-power FPGAs, for implementing optimized inference engines. The focus will be on achieving real-time performance and robustness. This will be tested on cases from the industry partners.
  2. µBrains – Extending the edge-devices from pure inference engines to also provide local learning. This will allow local devices to improve continuously. We will explore suitable reconfigurable platform technologies with ultra-low power consumption, such as Renesas’ DRPs, which use 1/10 of the power budget of current solutions, and Microchip’s low-power FPGAs for optimizing neural networks. The focus will be on ensuring the performance, scheduling, and resource allocation of the new AI algorithms running on very resource-constrained edge-devices.
  3. Collective intelligence – The full potential of Embedded AI will require distributed processing of the AI algorithms. This will be based on federated learning and on computing hardware (microelectronics) optimized for neural networks, but new models of distributed systems and stochastic analysis are necessary to ensure the performance, prioritization, scheduling, resource allocation, and security of the new AI algorithms, especially with the very dynamic and opportunistic communications associated with IoT.
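
To make the federated learning idea in step 3 more concrete, here is a minimal sketch of federated averaging on a toy one-parameter model y = w * x. Each simulated device trains on its own data, and only the resulting weights are combined, weighted by the number of local samples. The model, the learning rate, the device data, and all function names are assumptions chosen for illustration. This is not the project's algorithm, and it ignores the scheduling, security, and unreliable-communication issues raised above.

```python
def local_update(weights, data, lr=0.05):
    """Run one pass of SGD on local (x, y) pairs for the model y = w * x."""
    w = weights[0]
    for x, y in data:
        grad = 2.0 * (w * x - y) * x  # derivative of the squared error w.r.t. w
        w -= lr * grad
    return [w]


def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for j in range(dim):
            averaged[j] += weights[j] * size / total
    return averaged


def run_rounds(device_data, rounds=20):
    """Alternate local training and averaging; raw data never leaves a device."""
    global_weights = [0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in device_data]
        sizes = [len(data) for data in device_data]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical local datasets on three devices, all generated from y = 2 * x.
device_data = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
final_weights = run_rounds(device_data)
```

In this toy setting the shared weight converges towards 2 even though no device ever sees another device's samples. Only the weight vectors are exchanged, which is the property that reduces data transfer.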

The expected outcome is an AI framework which supports autonomous discovery and processing of disparate data from a distributed collection of AI-enabled edge-devices. All three presented steps will be tested on cases from the industry partners.


Deep neural networks have changed the capabilities of machine learning, reaching higher accuracy than was previously possible. For learning from unstructured data, they are now the de facto standard. These networks often include millions of parameters and may take months to train on dedicated hardware, such as GPUs in the cloud. This has resulted in a high demand for data scientists with AI skills and, hence, an increased demand for educating such profiles. However, the increased use of IoT to collect data at the edge has created a wish to train and execute deep neural networks at the edge rather than transferring all data to the cloud for processing. As IoT end- or edge-devices are characterized by low memory, low processing power, and low energy (powered by battery or energy harvesting), training or executing deep neural networks on them is generally considered infeasible. However, developing dedicated accelerators, novel hardware circuits and architectures, or executing smaller discretized networks may provide feasible solutions for the edge.
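
One common way to obtain a smaller, discretized network is to store weights and activations as 8-bit integers instead of 32-bit floats. The sketch below shows symmetric linear quantization and an integer dot product that is rescaled to a float only at the end. It is a simplified illustration under assumed values, not the scheme used in the project, and all names and numbers are hypothetical.

```python
def quantize_int8(values):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


def int8_dot(q_weights, w_scale, q_inputs, x_scale):
    """Dot product in integer arithmetic, rescaled to a float at the end."""
    acc = sum(w * x for w, x in zip(q_weights, q_inputs))
    return acc * w_scale * x_scale


# Hypothetical weights and inputs of a single neuron.
weights = [0.4, -1.0, 0.2]
inputs = [1.0, 3.0, -4.0]

q_w, w_scale = quantize_int8(weights)
q_x, x_scale = quantize_int8(inputs)

exact = sum(w * x for w, x in zip(weights, inputs))
approx = int8_dot(q_w, w_scale, q_x, x_scale)
```

The quantized result differs slightly from the floating-point one. In exchange, each value needs one byte instead of four, and the inner loop uses only integer multiply-accumulate operations, which suit microcontrollers and FPGAs.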

The academic partners DTU, KU, AU, and CBS will not only create scientific value from the results disseminated through the four PhDs, but will also create important knowledge, experience, and real-life cases to be included in education, and hence build capacity in this important emerging field of embedded AI, or AIoT.

The industry partners Indesmatech, Grundfos, MAN ES, and VELUX are all strong examples of companies who will benefit from mastering embedded AI, i.e., being able to select the right tools and execution platforms for implementing and deploying embedded AI in their products.

  • Indesmatech expects to gain leading-edge knowledge about how AI can be implemented on various chip processing platforms, with a focus on finding the best and most efficient path to building cost- and performance-effective industrial solutions across industries, as its customers come from most industries.
  • Grundfos will create value in applications such as condition monitoring of pumps and pump systems, predictive maintenance, heat energy optimization in buildings, and waste-water treatment, where very complex tasks can be optimized significantly by AI. The possibility of deploying embedded AI directly on low-cost, low-power end- and edge-devices instead of large cloud platforms will give Grundfos a significant competitive advantage by reducing total energy consumption, data traffic, and product cost, while at the same time increasing real-time application performance and securing customers’ data privacy.
  • MAN ES will create value from using embedded AI to predict problems faster than today. Features such as condition monitoring and dynamic engine optimization will give MAN ES competitive advantages, and the exploitation of embedded AI together with the large amount of data the company has collected in the cloud will, in the long run, create marked advantages for MAN ES.
  • VELUX will increase its competitive edge by attaining a better understanding of how to implement the right level of embedded AI in its products. The design of new digital smart products with embedded intelligence will create value by driving the digital product transformation of VELUX.

January 1, 2022 – December 31, 2024 – 3 years.

Total budget DKK 22.5 million / DIREC investment DKK 6.54 million.


Project Manager

Jan Madsen


Technical University of Denmark
DTU Compute


Peter Gorm Larsen


Aarhus University
Dept. of Electrical and Computer Engineering

Mads Nielsen


University of Copenhagen
Department of Computer Science

Jan Damsgaard


Copenhagen Business School
Department of Digitalization

Thorkild Kvisgaard

Head of Electronics


Henrik R. Olesen

Senior Manager

MAN Energy Solutions

Thomas S. Toftegaard

Director, Smart Product Technology


Rune Domsten

Co-founder & CEO