Speaker: Taco Cohen
The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry. This class includes and generalizes existing methods from equivariant- and geometric deep learning, and thus unifies these areas in a common gauge-theoretic framework.
We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.
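As a minimal illustration of the equivariance principle in its simplest global form (not the gauge equivariant convolution on the icosahedron described above), the following Python sketch checks numerically that a circular cross-correlation commutes with cyclic shifts. The function names `circular_correlate` and `shift`, the signal length, and the random test data are assumptions introduced for this example only.

```python
import random


def circular_correlate(signal, kernel):
    """Circular cross-correlation: out[i] = sum_k kernel[k] * signal[(i + k) % n]."""
    n = len(signal)
    return [
        sum(w * signal[(i + k) % n] for k, w in enumerate(kernel))
        for i in range(n)
    ]


def shift(signal, s):
    """Cyclic shift by s positions: out[i] = signal[(i - s) % n]."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(16)]  # hypothetical input signal
kernel = [random.gauss(0.0, 1.0) for _ in range(3)]   # hypothetical filter

# Equivariance: shifting the input and then filtering gives the same result
# as filtering and then shifting the output.
shift_then_filter = circular_correlate(shift(signal, 5), kernel)
filter_then_shift = shift(circular_correlate(signal, kernel), 5)
equivariance_gap = max(abs(a - b) for a, b in zip(shift_then_filter, filter_then_shift))
```

Here the symmetry group is the global group of cyclic translations. The gauge equivariant setting of the talk replaces this single global transformation with a position-dependent choice of local frame, which this sketch does not attempt to model.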
Speaker: Jascha Sohl-Dickstein
As neural networks become highly overparameterized, their accuracy improves, and their behavior becomes easier to analyze theoretically. I will give an introduction to a rapidly growing body of work which examines the learning dynamics and prior over functions induced by infinitely wide, randomly initialized, neural networks. Core results that I will discuss include: that the distribution over functions computed by a wide neural network often corresponds to a Gaussian process with a particular compositional kernel, both before and after training; that the predictions of wide neural networks are linear in their parameters throughout training; and that this perspective enables analytic predictions for how trainability depends on hyperparameters and architecture. These results provide for surprising capabilities—for instance, the evaluation of test set predictions which would come from an infinitely wide trained neural network without ever instantiating a neural network, or the rapid training of 10,000+ layer convolutional networks. I will argue that this growing understanding of neural networks in the limit of infinite width is foundational for future theoretical and practical understanding of deep learning.
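For fully connected ReLU networks, the compositional kernel mentioned above has a well-known closed-form layer-to-layer recursion (the arc-cosine formula). The Python sketch below is one assumed instance of it: the function name `nngp_relu_kernel`, the 1/d input normalization, the variance parameters `sigma_w2` and `sigma_b2`, and the convention that `depth` counts hidden layers are illustrative choices, not details given in the abstract.

```python
import math


def nngp_relu_kernel(x, y, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Gaussian process kernel of an infinitely wide ReLU network (sketch).

    x, y     : nonzero input vectors of equal length (assumption)
    depth    : number of hidden ReLU layers (assumed convention)
    sigma_w2 : weight variance scale; sigma_b2 : bias variance
    """
    d = len(x)

    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    # Kernel of the first pre-activations (an affine map of the inputs).
    kxx = sigma_b2 + sigma_w2 * dot(x, x) / d
    kyy = sigma_b2 + sigma_w2 * dot(y, y) / d
    kxy = sigma_b2 + sigma_w2 * dot(x, y) / d

    for _ in range(depth):
        norm = math.sqrt(kxx * kyy)
        # Clamp to guard against rounding slightly outside [-1, 1].
        cos_theta = max(-1.0, min(1.0, kxy / norm))
        theta = math.acos(cos_theta)
        # E[relu(u) relu(v)] for jointly Gaussian (u, v) with this covariance.
        expect_xy = norm * (math.sin(theta) + (math.pi - theta) * cos_theta) / (2.0 * math.pi)
        kxy = sigma_b2 + sigma_w2 * expect_xy
        # On the diagonal, E[relu(u)^2] = Var(u) / 2.
        kxx = sigma_b2 + sigma_w2 * kxx / 2.0
        kyy = sigma_b2 + sigma_w2 * kyy / 2.0

    return kxy


# Example: kernel between two orthogonal inputs after one hidden layer.
k_orthogonal = nngp_relu_kernel([1.0, 0.0], [0.0, 1.0], depth=1)
```

With `sigma_w2 = 2` and `sigma_b2 = 0`, the diagonal of the kernel is unchanged from layer to layer, which is one simple example of how signal propagation depends on the initialization hyperparameters.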
Speaker: Mike Douglas
A fundamental theory of physics must explain how to derive the laws of physics ab initio, from purely formal constructions. In string/M theory there are general laws, such as general relativity and Yang-Mills theory, which are fixed by the theory. There are also specific details, such as the spectrum of elementary particles and the strengths of interactions between them. These are not determined uniquely, but are derived from the geometry of extra dimensions of space. This geometry must take one of a few special forms called special holonomy manifolds, for example a Calabi-Yau (CY) manifold, or a G2 manifold. These manifolds are of significant mathematical interest independent of their relevance to string theory, and mathematicians and physicists have been working together to classify them and work out the relations between them.
Such data (a set of objects and relations between them, defined by simple axioms) can be called a mathematical landscape. There are many important landscapes besides special holonomy manifolds: finite groups, bundles on manifolds, other classes of manifolds, etc. Many landscapes, such as that of six-dimensional CY manifolds, turn out to be combinatorially large. It is a further challenge to extract simple and useful pictures from the vast wealth of their data. Machine learning will be an essential tool to meet this challenge.
As an example, in the mid-1990s, the physicists Kreuzer and Skarke surveyed the 6-d toric hypersurface CYs to study mirror symmetry. The plot that demonstrated this revealed a ‘shield’ pattern which nobody had anticipated or even thought to ask about, whose explanation was attempted in several later works, and which may be the key to understanding the finite number of these spaces. Future studies of mathematical landscapes will no doubt reveal new unexpected patterns, especially if we have ML to help look for the patterns.
We will give a high level survey of work in this direction, trying to minimize mathematical and physical technicalities, and to raise new questions.
Speaker: Koji Hashimoto
Revealing the quantum nature of matter is indispensable for finding new features of known exotic matter and even new materials, and it drives progress in theoretical physics. One of the most important tasks is to explicitly construct ground-state wave functions for given Hamiltonians. On the other hand, the holographic principle, discovered in the context of string theory, claims an equivalence between quantum matter and classical higher-dimensional gravity. It has provided a completely new viewpoint on the understanding of quantum matter, such as quarks in QCD and strongly correlated electrons.
Both lines of research require solving inverse problems, for which machine learning should be intrinsically effective. They also share an important similarity: matter wave functions are given by tensor network optimization, and the holographic principle defines quantum gravity in which networks are regarded as discretized spacetime. Network optimization will therefore play a central role in these sciences.
In this talk, I give a brief review of several important topics related to these networks, and provide a concrete example of applying a deep neural network to the holographic principle. The higher-dimensional curved spacetime is discretized into a deep neural network, and the input data (quark correlators) are used to optimize the network. The neural network weights are regarded as the emergent spacetime metric, and some other physical observables predicted from the trained network geometry can be compared well with supercomputer simulations of QCD.
Speaker: Pratik Chaudhari
Deep networks are mysterious. These over-parametrized machine learning models, trained with rudimentary optimization algorithms on non-convex landscapes in millions of dimensions, have defied attempts to put a sound theoretical footing beneath their impressive performance.
This talk will shed light upon some of these mysteries. I will employ diverse ideas—from thermodynamics and optimal transportation to partial differential equations, control theory and Bayesian inference—and paint a picture of the training process of deep networks. Along the way, I will develop state-of-the-art algorithms for non-convex optimization.
The goal of machine perception is not just to classify objects in images but to enable intelligent agents that can seamlessly interact with our physical world. I will conclude with a vision of how advances in machine learning and robotics may come together to help build such an Embodied Intelligence.
Speaker: Liam McAllister
A foundational problem in string theory is to derive statistical predictions for observable phenomena in cosmology and in particle physics. I will give an accessible overview of this subject, assuming no prior knowledge of string theory.
I will explain that the set of possible natural laws is contained in the set of solutions of string theory that have no fields with zero mass and zero spin. Each solution is determined by a finite number of integers that specify the topology of the six-manifold on which the six extra dimensions of string theory are compactified. The total number of solutions is plausibly finite, albeit large. This set of solutions, the ‘landscape of string theory’, is the main object of study when relating string theory to observations.
I will discuss how the work of deriving predictions from the string landscape can be formulated as a computational problem. The integers specifying the six-manifold topology are the fundamental parameters in nature, and the task is to find out which values they can take, and what observables result for each choice. I will illustrate this problem in the case of six-manifolds that are defined by triangulations of certain four-dimensional lattice polytopes. Such manifolds are finite in number, though the number may exceed 10^900, and the computational tasks are almost entirely combinatorial. In this realm, one can aim to use machine learning to find patterns, or to pick out solutions with desirable properties. I will suggest some problems of this sort.