Project type: Explore Project
Sensitivity measures how much program outputs vary when changing inputs. We propose exploring novel methodologies for specifying and verifying sensitivity properties of probabilistic programs such that they (a) are comprehensible to everyday programmers, (b) can be verified using automated theorem provers, and (c) cover properties from the machine learning and security literature.
This work will bring together two junior researchers who recently arrived in Denmark and obtained their PhDs working on probabilistic verification.
Project description
Our overall objective is to explore how automated verification of sensitivity properties of probabilistic programs can support developers in increasing the trust in their software through formal assurances.
Probabilistic programs are programs with the ability to sample from probability distributions. Examples include randomized algorithms, where sampling is exploited to ensure that expensive executions occur only with low probability; cryptographic protocols, where randomness is essential for encoding secrets; and statistics, where probabilistic programs are becoming a popular alternative to graphical models for describing complex distributions.
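To make this concrete, the sketch below shows a small probabilistic program in the security tradition: a randomized-response mechanism that hides an individual's secret bit behind a coin flip while still allowing aggregate estimation. The function names and the parameter p are our own illustrative choices, not taken from the proposal.

```python
import random

def randomized_response(secret: bool, p: float = 0.75,
                        rng: random.Random = random) -> bool:
    # Report the true secret with probability p, otherwise flip it.
    # Sampling hides any single secret; aggregates remain recoverable.
    if rng.random() < p:
        return secret
    return not secret

def estimate_true_fraction(reports, p: float = 0.75) -> float:
    # Invert the noise: observed = p*t + (1-p)*(1-t) for true fraction t.
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)
```

With p = 0.75, a batch in which 75% of the noisy reports are True decodes to a true fraction of 1.0, since only the noise accounts for the remaining 25%.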
The sensitivity of a program determines how its outputs are affected by changes to its input; programs with low sensitivity are robust against fluctuations in their input – a key property for improving trust in software. Minor input changes should, for example, not affect the result of a classifier learned from training data. In the probabilistic setting, the output of a program depends not only on the input but also on the source of randomness. Hence, the notion of sensitivity – as well as techniques for reasoning about it – needs refinement.
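The following toy program (our own assumed example, not one from the literature cited here) illustrates sensitivity in expectation: a program that keeps each of n items with probability 1/2 has expected output n/2, so its expected value changes by at most 0.5 per unit change in n.

```python
import random

def subsample_count(n: int, rng: random.Random) -> int:
    # Probabilistic program: keep each of n items with probability 1/2.
    return sum(1 for _ in range(n) if rng.random() < 0.5)

def expected_output(n: int) -> float:
    # E[subsample_count(n)] = n/2, by linearity of expectation.
    return n / 2

# Sensitivity in expectation: |E[f(n)] - E[f(n')]| <= 0.5 * |n - n'|.
assert abs(expected_output(10) - expected_output(11)) <= 0.5 * abs(10 - 11)
```

Note that individual runs of subsample_count can differ by much more than 0.5; the bound concerns the output distribution, which is exactly why probabilistic sensitivity needs its own reasoning techniques.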
Automated verification takes a deductive approach to proving that a program satisfies its specification: users annotate their programs with logical assertions; a verifier then generates verification conditions (VCs) whose validity implies that the program’s specification holds. Deductive verifiers are more complete and more scalable than fully automatic techniques but require significant user interaction. The main challenge for users of automated verifiers lies in finding suitable intermediate assertions, particularly loop invariants, such that an automated theorem prover can discharge the generated VCs. A significant challenge for developers of automated verifiers is to keep the amount and complexity of necessary annotations as low as possible.
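As a toy illustration of the deductive workflow (an assumed example; a real verifier would discharge these conditions with an SMT solver rather than by enumeration), consider the loop `i, s = 0, 0; while i < n: s += i; i += 1` with candidate invariant s == i*(i-1)//2. The verifier would generate three VCs, checked here by brute force over a small domain:

```python
def invariant(i: int, s: int) -> bool:
    # Candidate loop invariant: s holds the sum 0 + 1 + ... + (i-1).
    return s == i * (i - 1) // 2

def vc_initiation() -> bool:
    # VC 1: the invariant holds on loop entry (i = 0, s = 0).
    return invariant(0, 0)

def vc_preservation(bound: int = 50) -> bool:
    # VC 2: if the invariant and the guard hold, one iteration
    # (s += i; i += 1) re-establishes the invariant.
    return all(invariant(i + 1, s + i)
               for i in range(bound)
               for s in [i * (i - 1) // 2])

def vc_postcondition(bound: int = 50) -> bool:
    # VC 3: on exit (i == n) the invariant implies s == n*(n-1)//2.
    return all(invariant(n, n * (n - 1) // 2) for n in range(bound))
```

The user's burden is supplying the invariant; generating and discharging the VCs is the tool's job, which is why keeping annotations simple matters.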
Previous work [1] co-authored by the applicants provides a theoretical framework for reasoning about the sensitivity of probabilistic programs: the paper presents a calculus for carrying out “pen-and-paper” sensitivity proofs in a principled, syntax-directed manner. The technique deals with sampling instructions by requiring users to identify suitable probabilistic couplings, which act as synchronization points, in addition to finding loop invariants. However, the technique does not provide tight sensitivity bounds when a change to the input causes the program to take a different branch at a conditional.
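The simplest coupling, the identity coupling, can be demonstrated empirically (an illustrative sketch of the idea, not the calculus of [1]): running two nearby executions of a random walk on the same random stream makes every step synchronize, so the output distance equals the input distance on every sample path, witnessing 1-sensitivity in the start position.

```python
import random

def walk(start: int, steps: int, rng: random.Random) -> int:
    # Symmetric random walk: step +1 or -1 with equal probability.
    pos = start
    for _ in range(steps):
        pos += 1 if rng.random() < 0.5 else -1
    return pos

# Identity coupling: feed both runs identical randomness. The walks
# then move in lockstep, so |output1 - output2| == |start1 - start2|.
rng1, rng2 = random.Random(42), random.Random(42)
out1 = walk(0, 100, rng1)
out2 = walk(3, 100, rng2)
assert abs(out1 - out2) == abs(0 - 3)
```

Choosing the coupling is the creative step the user must supply on top of a loop invariant; here the identity coupling suffices, but other programs require more ingenious pairings of samples.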
Our project has four main goals. First, we will develop methodologies that do not suffer from the limitations of [1]. We believe that conditional branching can be treated by carefully tracking the possible divergence. Second, we will develop an automated verification tool for proving sensitivity properties of probabilistic programs. The tool will generate VCs based on the calculus from [1], which will be discharged using an SMT solver. In designing the specification language, we aim to strike a balance so that (a) users can conveniently specify synchronization points for random samples (via so-called probabilistic couplings) and (b) existing solvers can prove the resulting VCs. Third, we aim to ease the verification process by assisting users in finding synchronization points. Invariant synthesis has been extensively studied for deterministic programs; similarly, coupling synthesis has recently been studied for the verification of probabilistic programs [2]. We believe these techniques can be adapted to the study of sensitivity. Finally, we will validate the overall verification system by applying it to case studies from machine learning, statistics, and randomized algorithms.
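The branching divergence targeted by the first goal shows up already in a two-line program (a minimal assumed example): a threshold classifier maps arbitrarily close inputs on opposite sides of the guard to outputs at maximal distance, so any analysis that forces both executions down the same branch cannot give a tight bound.

```python
def classify(x: float) -> int:
    # Threshold program: inputs on opposite sides of the guard take
    # different branches, breaking branch-synchronized reasoning.
    return 1 if x >= 0.0 else -1

# Inputs at distance 0.002 yield outputs at distance 2:
assert classify(0.001) - classify(-0.001) == 2
```

Tracking how far the two executions can diverge after such a split, rather than insisting they stay synchronized, is the refinement the first goal pursues.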
Aarhus University
Department of Computer Science
Technical University of Denmark
DTU Compute