Probabilistic Programming

Module aims

To give students the tools and techniques to define computational generative models, condition them on data, and formulate inference problems using probabilistic programming systems (PPSs). A probabilistic programming system is, in general, composed of a probabilistic programming language (PPL) and a runtime that performs the required code interpretation and inference tasks. The course will look at the principles of probabilistic programming languages, including the main primitives for specifying random variables as first-class types and the use of observation primitives for conditioning on external information, as well as the main inference methods available in common PPSs. It will also cover basic notions related to the semantics of probabilistic languages, the analysis of probabilistic programs, and applications to data analysis.
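
As an illustration of the sample/observe pattern described above, the following is a minimal sketch and not the PPS used in the course. It assumes a toy embedding in Python in which a hypothetical `Trace` object offers a `sample_uniform` primitive to draw a random variable and an `observe_bernoulli` primitive to condition on data, with inference done by likelihood weighting (a simple Monte Carlo method). All names are invented for this example.

```python
import math
import random


class Trace:
    """Hypothetical runtime object: records the log-weight of one program run."""

    def __init__(self, rng):
        self.rng = rng
        self.log_weight = 0.0

    def sample_uniform(self, low, high):
        # "sample" primitive: draw a latent random variable from its prior.
        return self.rng.uniform(low, high)

    def observe_bernoulli(self, p, value):
        # "observe" primitive: condition on data by scoring its likelihood.
        likelihood = p if value else 1.0 - p
        self.log_weight += math.log(likelihood) if likelihood > 0 else float("-inf")


def coin_model(trace, flips):
    """Generative model: unknown coin bias with a uniform prior."""
    bias = trace.sample_uniform(0.0, 1.0)
    for flip in flips:
        trace.observe_bernoulli(bias, flip)
    return bias


def posterior_mean(model, data, num_samples=20000, seed=0):
    """Likelihood weighting: run the model many times, average by weight."""
    rng = random.Random(seed)
    total_weight = 0.0
    weighted_sum = 0.0
    for _ in range(num_samples):
        trace = Trace(rng)
        value = model(trace, data)
        weight = math.exp(trace.log_weight)
        total_weight += weight
        weighted_sum += weight * value
    return weighted_sum / total_weight


flips = [True, True, True, False]  # three heads, one tail
estimate = posterior_mean(coin_model, flips)
print(f"estimated posterior mean of the bias: {estimate:.3f}")
```

For this model the exact posterior is Beta(4, 2), whose mean is 2/3, so the printed estimate should be close to that value. The model function is ordinary Python code; only the runtime (`Trace` and `posterior_mean`) knows how to turn it into an inference result, which mirrors the language/runtime split mentioned above.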

The course will provide an overview of domain-specific languages specialized in the formulation of probabilistic models, as well as of the extension of general-purpose languages with probabilistic primitives to allow the embedding of probabilistic models within general applications.

More specifically, students will:
- gain familiarity with the key probabilistic primitives commonly available in probabilistic programming languages and their use to formalize probabilistic models and inference problems, with particular emphasis on Bayesian inference
- study and specify the semantics of the main probabilistic programming constructs and the basic compilation and analysis methods they enable
- practice the use of PPLs to define classic modeling and inference problems related to computation, data analysis, and dynamic systems analysis
- appreciate the complexity, expressiveness, limitations, and applicability scope of different languages and inference methods supported by common PPSs
- learn how static analysis techniques and symbolic reasoning enable the analytical solution, simplification, and quality assurance of computational inference problems expressed in probabilistic programming languages

Learning outcomes

After the course, students should be able to:
1. design and implement simple probabilistic programming systems, including engineering a simple probabilistic language and related inference routines
2. write generative probabilistic models in a suitable probabilistic programming language
3. select appropriate analytical or statistical methods to solve common inference problems
4. compose computational probabilistic models and embed them within general applications

Module syllabus

  • Review of probability theory
  • Generative models, conditioning, and inference off-the-shelf
    • formulation of classic probabilistic models and data analysis tasks in common PPLs
  • Syntax, semantics, and inference for a simple probabilistic programming system
    • specification of a probabilistic programming language
    • implementation of basic inference methods based on program analysis and Monte Carlo simulation
    • symbolic reasoning and solution space quantification for inference
    • MCMC methods on simple programs
  • Symbolic and algebraic approaches to probabilistic reasoning
  • Different programming paradigms for PPSs
    • imperative, functional, OO, logic, higher-order languages and their capabilities and trade-offs with examples
    • computational optimizations, graphical model semantics, and execution on hybrid computational architectures
    • limitations of purely probabilistic generative models and deep-PPLs
  • Data analysis and reasoning about partial and sparse information
  • Advanced Monte Carlo inference methods for PPSs
    • including No-U-Turn, SMC, PMC, and variational analysis
  • Analysis and quality assurance for probabilistic programs and probabilistic programming systems

Teaching methods

7 weeks of 2-hour lectures and 2-hour tutorials (interleaved)

Each week: practical coding examples and recommended exercises
 

Assessments

Four coursework assignments spanning one week each.

CW1 and CW2 involve the extension of a simple PPS developed for the course with 1) additional language constructs and 2) additional inference methods. Simple probabilistic inference problems will be used to test the solutions and drive their development. These CWs are mainly related to ILOs 1 and 2.

CW3 and CW4 involve the implementation of probabilistic programs on real PPSs to 1) analyze more complex problems and 2) perform data analysis tasks. For their implementations, students will be required to select a language from a given set of options. These CWs are mainly related to ILOs 2-4.

- Extension of a simple probabilistic programming language
- Extension of a simple probabilistic programming system
- Modeling hierarchical/dynamical stochastic systems
- Advanced learning and data analysis

All coursework will be submitted as an archive containing the files needed to execute and test the solution. Submissions will be marked by the lecturer (and possibly TAs).

Module leaders

Dr Antonio Filieri