DIREC project

EXPLAIN-ME

- Learning to Collaborate via Explainable AI in Medical Education

Summary

In the Western world, roughly one in ten medical diagnoses is estimated to be wrong, with the result that patients do not receive the right treatment. One explanation may be a lack of experience and training among medical staff.

Together with clinicians, this project aims to develop explainable AI that can help medical staff make qualified decisions by acting as a mentor that gives feedback and advice while the staff train. It is essential that the explainable AI gives good explanations that are easy to understand and to use within the medical staff's workflow.

Project period: 2021-2025
Budget: DKK 28.44 million

AI is widely deployed in assistive medical technologies, such as image-based diagnosis, to solve highly specific tasks for which model optimization is feasible. However, AI is rarely designed as a collaborator for healthcare professionals, but rather as a mechanical substitute for part of a diagnostic workflow. From the AI researcher's point of view, the goal of development is to beat the state of the art on narrow performance metrics, on which the AI may reach superhuman accuracy.

However, for more general problems, such as making a full diagnosis, executing a treatment, or explaining the background for a diagnosis, the AI still cannot be trusted. Hence, clinicians do not always perceive AI solutions as helpful in solving their clinical tasks, as they only solve part of the problem sufficiently well. The EXPLAIN-ME initiative seeks to create AIs that help solve these overall, general tasks in collaboration with the human healthcare professional.

To do so, we need not only to provide interpretability in the form of explainable AI models — we need to provide models whose explanations are easy to understand and utilize during the clinician’s workflow. Put simply, we need to provide good explanations.

Unmet technical needs
It is not hard to agree that good explanations are better than bad explanations. In this project, however, we aim to establish methods and collect data that allow us to train and validate the quality of clinical AI explanations in terms of how understandable and useful they are.

AI support should neither distract from nor hinder ongoing tasks, and the need for AI support therefore fluctuates, e.g. throughout a surgical procedure. As such, the relevance and utility of AI explanations are highly context- and task-dependent. Through collaboration with Zealand University Hospital, we will develop explainable AI (XAI) feedback for human-AI collaboration in static clinical procedures, where data is collected and analyzed independently, e.g. when diagnosing cancer from scans collected beforehand in a different unit.

In collaboration with CAMES and NordSim, we will implement human-AI collaboration in simulation centers used to train clinicians in dynamic clinical procedures, where data is collected on the fly — e.g. for ultrasound scanning of pregnant women, or robotic surgery. We will monitor the clinicians’ behavior and performance as a function of feedback provided by the AI. As there are no actual patients involved in medical simulation, we are also free to provide clinicians with potentially bad explanations, and we may use the clinicians’ responses to freely train and evaluate the AI’s ability to explain.
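
What counts as a good explanation can thus be learned from the simulation itself. As a concrete illustration, the Python sketch below shows the kind of trial log such a study could build on; all names, fields, and values are hypothetical and stand in for the project's actual instrumentation. Each simulated trial pairs the explanation shown to the trainee, including deliberately degraded ones, with the trainee's response and performance.

    from dataclasses import dataclass, asdict
    import json

    @dataclass
    class TrialRecord:
        trial_id: int
        explanation: str          # the XAI feedback shown to the trainee
        explanation_quality: str  # "good" / "degraded"; known in simulation
        clinician_action: str     # what the trainee did after the feedback
        task_score: float         # simulator-graded task performance
        response_time_s: float    # seconds from feedback to action

    def log_trial(record: TrialRecord, path: str = "trials.jsonl") -> None:
        """Append one simulation trial to a JSON-lines log for offline analysis."""
        with open(path, "a") as f:
            f.write(json.dumps(asdict(record)) + "\n")

    # Example: a deliberately degraded explanation, harmless in simulation.
    log_trial(TrialRecord(
        trial_id=42,
        explanation="Rotate the probe clockwise to recover the standard plane.",
        explanation_quality="degraded",
        clinician_action="rotated_counterclockwise",
        task_score=0.61,
        response_time_s=3.8,
    ))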

Unmet clinical needs
In the Western world, only cancer and heart disease cause more fatalities than medical errors. About one in ten diagnoses is estimated to be wrong, resulting in inadequate and even harmful care. Errors occur in clinical practice for several reasons, but most importantly because clinicians often work alone with minimal expert supervision and support. The EXPLAIN-ME initiative aims to create AI decision-support systems that take the role of an experienced mentor providing advice and feedback.

This initiative seeks to optimize the utility of feedback provided by healthcare explainable AI (XAI). We will approach this problem both in static healthcare applications, where clinical decisions are based on data already collected, and in dynamic applications, where data is collected on the fly to continually improve confidence in the clinical decision. Via an interdisciplinary effort between XAI, medical simulation, participatory design and HCI, we aim to optimize the explanations provided by the XAI to be of maximal utility for clinicians, supporting technology utility and acceptance in the clinic.

Case 1: Renal tumor classification
Classifying a renal tumor as malignant or benign is an example of a decision that must be made under time pressure. If the tumor is malignant, the patient should be operated on immediately to prevent the cancer from spreading to the rest of the body; a false positive diagnosis may therefore lead to the unnecessary removal of a kidney and other complications. While AI methods can be shown statistically to be more precise than an expert physician, they need to be extended with explanations for their decisions, and only physicians know what “a good explanation” is. This motivates a collaborative design and development process to find the best balance between what is technically possible and what is clinically needed.
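
As an illustration of what such an explanation could look like technically, the sketch below applies Grad-CAM, one widely used XAI technique, to a generic PyTorch image classifier. The two-class renal-tumor model, the input, and the label convention are hypothetical placeholders; whether a saliency map of this kind is in fact “a good explanation” is precisely what the collaborative process must determine.

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    model = resnet18(num_classes=2)  # hypothetical stand-in for a tumor classifier
    model.eval()

    captured = {}

    def save_activation(module, inputs, output):
        output.retain_grad()          # keep gradients for this non-leaf tensor
        captured["features"] = output

    # Hook the last convolutional block to capture its feature maps.
    model.layer4.register_forward_hook(save_activation)

    x = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed scan slice
    logits = model(x)
    predicted = logits.argmax(dim=1).item()  # hypothetical: 0 = benign, 1 = malignant
    logits[0, predicted].backward()

    # Grad-CAM: weight each feature map by its mean gradient, sum, and rectify.
    feats = captured["features"]
    weights = feats.grad.mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * feats.detach()).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    # `cam` now highlights the image regions that drove the prediction.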

Case 2: Ultrasound Screening
Even before birth, patients suffer from erroneous decisions made by healthcare workers. In Denmark, 95% of all pregnant women participate in the national ultrasound screening program aimed at detecting severe maternal-fetal disease. Correct diagnosis is directly linked to the skills of the clinicians, and only about half of all serious conditions are detected before birth. AI feedback therefore has the potential to standardize care across clinicians and hospitals. At DTU, KU and CAMES, ultrasound imaging will be the main case for development, as data access and management, as well as manual annotations, are already in place. We seek to give the clinician feedback during scanning, such as whether the current image is a standard ultrasound plane; whether it has sufficient quality; whether the image can be used to predict clinical outcomes; or how to move the probe to improve image quality.
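
To sketch how such per-frame feedback could be structured, the Python below wires the listed feedback types into a single loop. The three models are deterministic stubs, and every threshold and message is a hypothetical illustration rather than the project's actual pipeline.

    import numpy as np

    STANDARD_PLANES = ("head", "abdomen", "femur")

    def classify_plane(frame: np.ndarray) -> tuple[str, float]:
        """Stand-in for a trained standard-plane classifier: (label, confidence)."""
        return "abdomen", 0.92                     # hypothetical fixed output

    def quality_score(frame: np.ndarray) -> float:
        """Stand-in for a trained image-quality model, returning a [0, 1] score."""
        return float(frame.mean()) / 255.0         # placeholder heuristic

    def probe_hint(frame: np.ndarray) -> str:
        """Stand-in for a model mapping the frame to a probe adjustment."""
        return "tilt the probe slightly caudally"  # hypothetical guidance

    def feedback_for_frame(frame: np.ndarray,
                           min_conf: float = 0.8,
                           min_quality: float = 0.7) -> str:
        plane, conf = classify_plane(frame)
        if plane not in STANDARD_PLANES or conf < min_conf:
            return "No standard plane found: " + probe_hint(frame)
        quality = quality_score(frame)
        if quality < min_quality:
            return f"{plane} plane, quality {quality:.2f} too low: " + probe_hint(frame)
        return f"{plane} plane, quality {quality:.2f}: image usable for assessment"

    # One frame of a (simulated) ultrasound stream:
    print(feedback_for_frame(np.full((480, 640), 180, dtype=np.uint8)))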

Case 3: Robotic Surgery
AAU and NordSim will collaborate on the assessment and development of robotic surgeons’ skills, associated with an existing clinical PhD project. Robotic surgery allows surgeons to do their work with more precision and control than traditional surgical tools, thereby reducing errors and increasing efficiency. AI-based decision support is expected to have a further positive effect on outcomes. The usability of AI decision support is critical, and this project will study temporal aspects of the human-AI collaboration, such as how to present AI suggestions in a timely manner without interrupting the clinician; how to hand over tasks between a member of the medical team and an AI system; and how to handle disagreement between the medical expert and the AI system.
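
As a minimal illustration of the timing question, the sketch below holds AI suggestions back while the surgeon is active and releases them during pauses. The class name, the idle threshold, and the motion signal are all hypothetical; which policies actually work is what the project will study empirically.

    import time
    from collections import deque
    from typing import Optional

    class SuggestionGate:
        """Queue AI suggestions and release them only during pauses in activity."""

        def __init__(self, pause_after_s: float = 2.0):
            self.pause_after_s = pause_after_s
            self.last_activity = time.monotonic()
            self.pending = deque()  # queued suggestion strings

        def on_instrument_motion(self) -> None:
            """Call whenever the robot reports instrument movement."""
            self.last_activity = time.monotonic()

        def suggest(self, message: str) -> None:
            self.pending.append(message)

        def poll(self) -> Optional[str]:
            """Return the next suggestion if the surgeon has paused long enough."""
            idle = time.monotonic() - self.last_activity
            if idle >= self.pause_after_s and self.pending:
                return self.pending.popleft()
            return None

    # Usage: motion events reset the idle timer; suggestions appear in pauses.
    gate = SuggestionGate(pause_after_s=2.0)
    gate.suggest("Consider repositioning the third arm.")
    gate.on_instrument_motion()
    print(gate.poll())  # None: the surgeon is still active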

In current healthcare AI research and development, there is often a gap between the needs of clinicians and the developed solutions. This comes with a lost opportunity for added value: we miss out on potential clinical value from creating standardized, high-quality care across demographic groups. Just as importantly, we miss out on added business value: if the first, research-based step in the development pipeline is unsuccessful, then there will also be fewer spin-offs and start-ups, less knowledge dissemination to industry, and overall less innovation in healthcare AI.

The EXPLAIN-ME initiative will address this problem:

  • We will improve clinical interpretability of healthcare AI by developing XAI methods and workflows that allow us to optimize XAI feedback for clinical utility, measured both on clinical performance and clinical outcomes.
  • We will improve clinical technology acceptance by introducing these XAI models in clinical training via simulation laboratories.
  • We will improve business value by creating a prototype for collaborative, simulation-based deployment of healthcare AI. This comes with great potential for speeding up industrial development of healthcare AI: Simulation-based testing of algorithms can begin while algorithms still make mistakes, because there is no risk of harming patients. This, in particular, can speed up the timeline from idea to clinical implementation, as the simulation-based testing is realistic while not requiring the usual ethical approvals.

This comes with great potential value: While AI has transformed many aspects of society, its impact on the healthcare sector is so far limited. Diagnostic AI is a key topic in healthcare research, but only marginally deployed in clinical care. This is partly explained by the low interpretability of state-of-the-art AI, which negatively affects both patient safety and clinicians’ technology acceptance. This is also explained by the typical workflow in healthcare AI research and development, which is often structured as parallel tracks where AI researchers independently develop technical solutions to a predefined clinical problem, while only occasionally interacting with the clinical end-users.

This often results in a gap between the clinicians' needs and the developed solution. The EXPLAIN-ME initiative aims to close this gap by developing AI solutions that are designed in interaction with clinicians at every step of the design, training, and implementation process.

Value

The project will develop explainable AI that can help medical staff make qualified decisions by taking the role of a mentor.

Participants

Project Manager

Aasa Feragen

Professor

Technical University of Denmark
DTU Compute

E: afhar@dtu.dk

Anders Nymark Christensen

Associate Professor

Technical University of Denmark
DTU Compute

Mads Nielsen

Professor

University of Copenhagen
Department of Computer Science

Mikael B. Skov

Professor

Aalborg University
Department of Computer Science

Niels van Berkel

Associate Professor

Aalborg University
Department of Computer Science

Henning Christiansen

Professor

Roskilde University
Department of People and Technology

Jesper Simonsen

Professor

Roskilde University
Department of People and Technology

Henrik Bulskov Styltsvig

Associate Professor

Roskilde University
Department of People and Technology

Martin Tolsgaard

Associate Professor

CAMES Rigshospitalet

Morten Bo Svendsen

Chief Engineer

CAMES Rigshospitalet

Sten Rasmussen

Professor, Head

Department of Clinical Medicine
Aalborg University

Mikkel Lønborg Friis

Director

NordSim
Aalborg University

Nessn Htum Azawi

Associate Professor,
Head of Research Unit & Renal Cancer team

Department of Urology
Zealand University Hospital

Manxi Lin

PhD Student

Technical University of Denmark
DTU Compute

Naja Kathrine Kollerup

PhD Student

Aalborg University
Department of Computer Science

Jakob Ambsdorf

PhD Student

University of Copenhagen
Department of Computer Science

Daniel van Dijk Jacobsen

PhD Student

Roskilde University
Department of People and Technology

Partners