UC Berkeley

Title: Model Selection And Ensembling When There Are More Parameters Than Data

Abstract: Despite years of empirical success with deep learning on many large-scale problems, existing theoretical frameworks fail to explain many of the most successful heuristics used by practitioners. The primary weakness of most approaches is their reliance on the classical large-data regime, in which neural networks often do not operate because of their large size. To overcome this issue, I will describe how, for any overparameterized (high-dimensional) model, there exists a dual underparameterized (low-dimensional) model that possesses the same marginal likelihood, establishing a form of Bayesian duality. Applying classical methods to this dual model reveals the Interpolating Information Criterion, a measure of model quality that is consistent with current deep learning heuristics. I will also describe how, in many modern machine learning settings, the benefits of ensembling are less ubiquitous and less obvious than in classical settings. Theoretically, we prove simple new results relating the ensemble improvement rate (a measure of how much ensembling decreases the error rate versus a single model, on a relative scale) to the disagreement-error ratio. Empirically, the predictions made by our theory hold, and we identify practical scenarios where ensembling does and does not result in large performance improvements. Perhaps most notably, we demonstrate a distinct difference in behavior between interpolating models (popular in current practice) and non-interpolating models (such as tree-based methods, where ensembling is popular): ensembling helps considerably more in the latter case than in the former.


University College London (UCL)

Title: Variational Bayesian inference for structure learning

Abstract: I will introduce recent advances in model inversion and model selection, which use a suite of statistical methods called Variational Bayes. These methods originated in statistical physics and have since been developed in neuroimaging and machine learning. Bayesian methods provide a principled approach for identifying the structure of dynamical systems, that is, for finding a set of model parameters that can explain observed data. I will use examples from neuroimaging, where the objective is to resolve a particularly difficult ill-posed problem: how can we identify the parameters of the dynamical systems that explain the activity of biological neural networks, which in turn give rise to measurements from brain recordings? The methods I will introduce are accompanied by freely available source code, enabling translation to other disciplines in science and engineering.


The workshop format will be as usual: arrival is on Sunday afternoon (September 29th), and dinner will be served.

The workshop ends on Wednesday, October 2nd, after lunch.


To be announced.


We recommend A0 format for the posters.

Authors will also present a “poster teaser” of about one minute right before the poster session, as indicated in the program.

For this purpose, authors should send ONE slide in PDF format by September 20th.

SOCIAL PROGRAM (Tuesday October 1st, afternoon)

The social program will include a guided visit to Murano island and a glass furnace, concluding with the social dinner in Murano.