BIAS 22 – Review Day 1 – Dr James Cussens: ‘Algorithms for learning Bayesian networks’

BIAS 22 DAY 1, TALK 1

This blog post is written by CDT Students Roussel Desmond Nzoyem, Davide Turco and Mauro Comi

Tuesday 6 September 2022 marked the start of the second edition of the Bristol Interactive AI Summer School (BIAS): a unique blend of events (talks, workshops, etc.) focusing on machine learning and other forms of AI explored in the Interactive AI CDT.

Following tradition, BIAS22 began with a few words of introduction from the CDT director, Professor Peter Flach. He welcomed and warmly thanked the roomful of attendees from academia and industry.

Prof Flach proceeded with a passionate presentation of the range of speakers while giving the audience a brief taste of what to expect during the three-day event: talks and workshops, along with a barbecue! He remarked on the variety of Interactive AI ingredients that would be touched on: data-driven AI, knowledge-driven AI, human-AI interaction, and Responsible AI.

Prof Flach’s introduction ended with an acknowledgement of the organisers of the event.

Dr James Cussens: ‘Algorithms for learning Bayesian networks’

The first talk of the day was an introduction to Bayesian networks and methods to learn them, given by our very own James Cussens.

Bayesian networks (BNs) are directed acyclic graphs (DAGs) in which each node represents a random variable. The important aspect of these networks, as Dr Cussens highlighted, is that they define both probability distributions and causal relationships: this makes Bayesian networks a popular tool in complex fields such as epidemiology and the medical sciences.
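To make the "defines a probability distribution" point concrete, here is a minimal sketch of how a BN factorises a joint distribution into one conditional probability per node given its parents. The three-variable network (Rain, Sprinkler, WetGrass) and all the numbers are made up for illustration; they were not part of the talk.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler.
# Each node maps to the tuple of its parents.
parents = {
    "Rain": (),
    "Sprinkler": (),
    "WetGrass": ("Rain", "Sprinkler"),
}

# Conditional probability tables: P(node = True | parent values).
# The numbers are invented for illustration only.
p_true = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (False, False): 0.0,
        (False, True): 0.9,
        (True, False): 0.8,
        (True, True): 0.99,
    },
}


def joint_probability(assignment):
    """Return P(assignment) as the product of P(node | parents)."""
    prob = 1.0
    for node, pars in parents.items():
        parent_values = tuple(assignment[p] for p in pars)
        p = p_true[node][parent_values]
        prob *= p if assignment[node] else 1.0 - p
    return prob


# Summing the factorised joint over all assignments should give 1.
names = list(parents)
total = sum(
    joint_probability(dict(zip(names, values)))
    for values in product([False, True], repeat=len(names))
)
```

Because every node only needs a table indexed by its parents, the graph structure determines how compactly the joint distribution can be stored.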

Learning BNs is a form of unsupervised learning, based on the assumption that the available data (real or simulated) were generated by an underlying BN. There are multiple reasons for learning a BN from data, such as learning a data-generating probability distribution or learning conditional independence relationships between variables; the talk, however, focused on learning a BN in order to estimate a causal model of the data, a task that is not easy to accomplish with the other machine learning approaches we study and use in the CDT.

A popular approach to learning the structure of a BN, the so-called DAG, is constraint-based learning: the basic idea is to perform statistical tests of conditional independence on the data and find a DAG that is consistent with the outcomes of the tests. However, this approach presents some issues: for example, different DAGs can encode the same set of conditional independence relationships, so the tests alone cannot tell them apart.
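The issue of several DAGs encoding the same independencies can be illustrated with a small sketch. It relies on the standard graphical criterion (due to Verma and Pearl) that two DAGs over the same variables are Markov equivalent exactly when they share the same skeleton and the same v-structures. The function names and the three example graphs are our own and were not shown in the talk.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Colliders a -> c <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges_1, edges_2):
    """True if both DAGs have the same skeleton and the same v-structures."""
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))


chain = {("X", "Y"), ("Y", "Z")}           # X -> Y -> Z
reversed_chain = {("Y", "X"), ("Z", "Y")}  # X <- Y <- Z
collider = {("X", "Y"), ("Z", "Y")}        # X -> Y <- Z
```

Here `markov_equivalent(chain, reversed_chain)` holds, so no conditional independence test can orient those edges, while the collider is distinguishable from both chains.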

Dr Cussens then introduced DAGitty, a widely used software tool for creating DAGs and analysing their causal structure. It is important to note that DAGitty does not learn DAGs from data, but allows the researcher to perform interventions and graph surgery. For example, it could allow a clinician to infer a treatment-response causal effect without carrying out the intervention in practice. The talk also included a short excursus on score-based learning of BNs, which is a Bayesian approach to learning these networks, i.e., it is formulated with a prior.
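As a rough illustration of what graph surgery means, the sketch below intervenes on a variable by cutting every edge that points into it and leaving the rest of the DAG unchanged. This is plain Python written for this post, not the tool's own interface; the `do` helper and the clinical example (Severity, Treatment, Response) are hypothetical.

```python
def do(parents, node):
    """Return the mutilated DAG for an intervention on `node`.

    `parents` maps each variable to the set of its parents. Intervening
    removes every edge pointing into `node`; all other edges are kept.
    The input dictionary is not modified.
    """
    if node not in parents:
        raise KeyError(f"unknown variable: {node}")
    return {v: (set() if v == node else set(ps)) for v, ps in parents.items()}


# Hypothetical clinical DAG: Severity influences both Treatment and Response.
dag = {
    "Severity": set(),
    "Treatment": {"Severity"},
    "Response": {"Severity", "Treatment"},
}

# Setting the treatment by intervention cuts the Severity -> Treatment edge.
mutilated = do(dag, "Treatment")
```

In the mutilated graph, Treatment no longer depends on Severity, which is what separates the effect of assigning a treatment from merely observing who received it.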

There are many different methods for learning BNs, and evaluation is key to choosing the best one. Dr Cussens introduced benchpress, a framework for performing method evaluation and comparison, and showed some results from the benchpress paper, including the evaluation of his own method, GOBNILP (Globally Optimal Bayesian Network learning using Integer Linear Programming).

We are thankful to James Cussens for opening BIAS22 with his talk; it was great to get an introduction to these methods, which bring together many aspects of our CDT, such as causality and graphical models.
