The goal of Physics ∩ ML (read ‘Physics Meets ML’) is to bring together researchers from machine learning and physics to learn from each other and push research forward together. In this inaugural edition, we will especially highlight recent progress made in string theory with machine learning and in the understanding of deep learning from a physical angle. Nevertheless, we invite a cast with wide-ranging expertise in order to spark new ideas. Plenary sessions from experts in each field and shorter specialized talks will introduce existing research. We will hold moderated discussions and breakout groups in which participants can identify problems and hopefully begin new collaborations in both directions. For example, physical insights can motivate advanced algorithms in machine learning, and analysis of geometric and topological datasets with machine learning can yield critical new insights in fundamental physics.
Greg Yang, Microsoft Research
Jim Halverson, Northeastern University
Sven Krippendorf, LMU Munich
Fabian Ruehle, CERN, Oxford University
Rak-Kyeong Seong, Samsung SDS
Gary Shiu, University of Wisconsin
Day 1 | Thursday, April 25
|Session 1||Plenary talks|
|8:00 AM–9:00 AM||Breakfast|
|9:00 AM–9:45 AM||Gauge equivariant convolutional networks||Taco Cohen|
|9:45 AM–10:30 AM||Understanding overparameterized neural networks||Jascha Sohl-Dickstein|
|10:30 AM–11:00 AM||Break|
|11:00 AM–11:45 AM||Mathematical landscapes and string theory||Mike Douglas|
|11:45 AM–12:30 PM||Holography, matter and deep learning||Koji Hashimoto|
|12:30 PM–2:00 PM||Lunch|
|2:00 PM–4:05 PM||Session 2||Applying physical insights to ML|
|2:00 PM–2:45 PM||Plenary: A picture of the energy landscape of deep neural networks||Pratik Chaudhari|
|2:45 PM–4:05 PM||Short talks|
|Neural tangent kernel and the dynamics of large neural nets||Clement Hongler|
|On the global convergence of gradient descent for over-parameterized models using optimal transport||Lénaïc Chizat|
|Pathological spectrum of the Fisher information matrix in deep neural networks||Ryo Karakida|
|Fluctuation-dissipation relation for stochastic gradient descent||Sho Yaida|
|From optimization algorithms to continuous dynamical systems and back||Rene Vidal|
|The effect of network width on stochastic gradient descent and generalization||Daniel Park|
|Short certificates for symmetric graph density inequalities||Rekha Thomas|
|Geometric representation learning in hyperbolic space||Maximilian Nickel|
|The fundamental equations of MNIST||Cedric Beny|
|Quantum states and Lyapunov functions reshape universal grammar||Paul Smolensky|
|Multi-scale deep generative networks for Bayesian inverse problems||Pengchuan Zhang|
|Variational quantum classifiers in the context of quantum machine learning||Alex Bocharov|
|4:05 PM–4:30 PM||Break|
|4:30 PM–5:30 PM||The intersect ∩|
Day 2 | Friday, April 26
|Session 3||Applying ML to physics|
|8:00 AM–9:00 AM||Breakfast|
|9:00 AM–9:45 AM||Plenary: Combinatorial Cosmology||Liam McAllister|
|9:45 AM–10:15 AM||Break|
|10:15 AM–11:35 AM||Short talks|
|Bypassing expensive steps in computational geometry||Yang-Hui He|
|Learning string theory at Large N||Cody Long|
|Training machines to extrapolate reliably over astronomical scales||Brent Nelson|
|Breaking the tunnel vision with ML||Sergei Gukov|
|Can machine learning give us new theoretical insights in physics and math?||Washington Taylor|
|Brief overview of machine learning holography||Yi-Zhuang You|
|Applications of persistent homology to physics||Alex Cole|
|Seeking a connection between the string landscape and particle physics||Patrick Vaudrevange|
|PBs^-1 to science: novel approaches on real-time processing from LHCb at CERN||Themis Bowcock|
|From non-parametric to parametric: manifold coordinates with physical meaning||Marina Meila|
|Machine learning in quantum many-body physics: A blitz||Yichen Huang|
|Knot Machine Learning||Vishnu Jejjala|
|11:35 AM–12:30 PM||Panel discussion with panelists Michael Freedman, Clement Hongler, Gary Shiu, Paul Smolensky, Washington Taylor|
|12:30 PM–1:30 PM||Lunch|
|Session 4||Breakout groups|
|1:30 PM–3:00 PM||Physics breakout groups|
|Symmetries and their realisations in string theory||Sergei Gukov, Yang-Hui He|
|String landscape||Michael Douglas, Liam McAllister|
|Connections of holography and ML||Koji Hashimoto, Yi-Zhuang You|
|3:00 PM–4:30 PM||ML breakout groups|
|Geometric representations in deep learning||Maximilian Nickel|
|Understanding deep learning||Yasaman Bahri, Boris Hanin, Jaehoon Lee|
|Physics and optimization||Rene Vidal|
Gauge equivariant convolutional networks
Speaker: Taco Cohen
The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry. This class includes and generalizes existing methods from equivariant- and geometric deep learning, and thus unifies these areas in a common gauge-theoretic framework.
We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.
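As a minimal illustration of the equivariance principle described above, the sketch below checks the simplest global case only: a circular convolution on a 1-D periodic signal commutes with cyclic shifts. It is not the gauge equivariant convolution from the talk. The function names `circular_conv` and `shift`, and the sample signal and filter, are hypothetical and chosen for illustration.

```python
def circular_conv(signal, kernel):
    """Circular (periodic) convolution of a 1-D signal with a short filter."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i - k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, s):
    """Cyclically shift a signal by s positions (the symmetry transformation)."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


# Illustrative data (assumed, not from the talk).
signal = [1.0, 2.0, 0.0, -1.0, 3.0]
kernel = [0.5, -1.0, 0.25]

# Equivariance: transforming the input and then convolving gives the same
# result as convolving first and then transforming the output.
shift_then_conv = circular_conv(shift(signal, 2), kernel)
conv_then_shift = shift(circular_conv(signal, kernel), 2)
is_equivariant = all(
    abs(a - b) < 1e-12 for a, b in zip(shift_then_conv, conv_then_shift)
)
```

The gauge equivariant setting replaces the single global shift with transformations that may differ from point to point on the manifold, which this toy example does not attempt to capture.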
Understanding overparameterized neural networks
Speaker: Jascha Sohl-Dickstein
As neural networks become highly overparameterized, their accuracy improves, and their behavior becomes easier to analyze theoretically. I will give an introduction to a rapidly growing body of work which examines the learning dynamics and prior over functions induced by infinitely wide, randomly initialized, neural networks. Core results that I will discuss include: that the distribution over functions computed by a wide neural network often corresponds to a Gaussian process with a particular compositional kernel, both before and after training; that the predictions of wide neural networks are linear in their parameters throughout training; and that this perspective enables analytic predictions for how trainability depends on hyperparameters and architecture. These results provide for surprising capabilities—for instance, the evaluation of test set predictions which would come from an infinitely wide trained neural network without ever instantiating a neural network, or the rapid training of 10,000+ layer convolutional networks. I will argue that this growing understanding of neural networks in the limit of infinite width is foundational for future theoretical and practical understanding of deep learning.
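The correspondence between wide networks and Gaussian processes can be sketched in a few lines. The example below assumes a fully connected ReLU network, for which the layer-to-layer kernel map has a known closed form (the arc-cosine kernel); that formula is a standard result and is not stated in the abstract. The function names, the default variances `sigma_w2=2.0` and `sigma_b2=0.0`, the normalization by input dimension, and the toy data are all assumptions made for illustration. The prediction computed is the Bayesian Gaussian process posterior mean, which need not coincide with the prediction of a network trained by gradient descent.

```python
import math


def relu_nngp_kernel(x, y, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Infinite-width kernel K(x, y) of a ReLU network with `depth` hidden layers."""
    d = len(x)
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    # Kernel of the first pre-activations (assumed convention: divide by input dim).
    kxx = sigma_b2 + sigma_w2 * dot(x, x) / d
    kyy = sigma_b2 + sigma_w2 * dot(y, y) / d
    kxy = sigma_b2 + sigma_w2 * dot(x, y) / d
    for _ in range(depth):
        norm = math.sqrt(kxx * kyy)
        cos_t = max(-1.0, min(1.0, kxy / norm)) if norm > 0 else 0.0
        theta = math.acos(cos_t)
        # Closed-form Gaussian expectation of relu(u) * relu(v) (arc-cosine kernel).
        kxy = sigma_b2 + sigma_w2 / (2 * math.pi) * norm * (
            math.sin(theta) + (math.pi - theta) * cos_t
        )
        kxx = sigma_b2 + sigma_w2 * kxx / 2
        kyy = sigma_b2 + sigma_w2 * kyy / 2
    return kxy


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def gp_posterior_mean(x_test, xs, ys, depth, jitter=1e-10):
    """GP posterior mean at x_test given training pairs (xs, ys); no network is built."""
    n = len(xs)
    gram = [
        [relu_nngp_kernel(xs[i], xs[j], depth) + (jitter if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve(gram, ys)
    return sum(relu_nngp_kernel(x_test, xs[i], depth) * alpha[i] for i in range(n))


# Toy usage (assumed data): two training points with labels +1 and -1.
train_x = [[1.0, 0.0], [0.0, 1.0]]
train_y = [1.0, -1.0]
prediction = gp_posterior_mean([0.9, 0.1], train_x, train_y, depth=3)
```

With `sigma_w2=2.0` and `sigma_b2=0.0` the diagonal of the kernel is preserved from layer to layer, which is why those defaults were chosen for this sketch; other choices change how the kernel evolves with depth.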
Mathematical landscapes and string theory
Speaker: Mike Douglas
A fundamental theory of physics must explain how to derive the laws of physics ab initio, from purely formal constructions. In string/M theory there are general laws, such as general relativity and Yang-Mills theory, which are fixed by the theory. There are also specific details, such as the spectrum of elementary particles and the strengths of interactions between them. These are not determined uniquely, but are derived from the geometry of extra dimensions of space. This geometry must take one of a few special forms called special holonomy manifolds, for example a Calabi-Yau (CY) manifold, or a G2 manifold. These manifolds are of significant mathematical interest independent of their relevance to string theory, and mathematicians and physicists have been working together to classify them and work out the relations between them.
Such data (a set of objects and relations between them, defined by simple axioms) can be called a mathematical landscape. There are many important landscapes besides special holonomy manifolds, such as finite groups, bundles on manifolds, and other classes of manifolds. Many landscapes, such as that of six-dimensional CY manifolds, turn out to be combinatorially large. It is a further challenge to extract simple and useful pictures from the vast wealth of their data. Machine learning will be an essential tool to meet this challenge.
As an example, in the mid-90s, the physicists Kreuzer and Skarke did a survey of the 6-d toric hypersurface CYs to study mirror symmetry. The plot which demonstrated this revealed a ‘shield’ pattern which nobody had anticipated or even thought to ask about, whose explanation was attempted in several later works, and which may be the key to understanding the finite number of these spaces. Future studies of mathematical landscapes will no doubt reveal new unexpected patterns, especially if we have ML to help look for the patterns.
We will give a high level survey of work in this direction, trying to minimize mathematical and physical technicalities, and to raise new questions.
Holography, matter and deep learning
Speaker: Koji Hashimoto
Revealing the quantum nature of matter is indispensable for finding new features of known exotic matter and even new materials, and it drives progress in theoretical physics. One of the most important tasks is to explicitly construct ground-state wave functions for given Hamiltonians. On the other hand, the holographic principle, discovered in the context of string theory, claims an equivalence between quantum matter and classical higher-dimensional gravity. It has provided a completely new viewpoint on the understanding of quantum matter, such as quarks in QCD and strongly correlated electrons.
Both require solving inverse problems, for which machine learning should be intrinsically effective. The two lines of research also share an important similarity: matter wave functions are obtained by tensor network optimization, and the holographic principle defines quantum gravity in which networks are regarded as discretized spacetime. Network optimization will therefore play a central role in these sciences.
In this talk, I give a brief review of several important topics related to these networks and provide a concrete example of applying a deep neural network to the holographic principle. The higher-dimensional curved spacetime is discretized into a deep neural network, and the network is optimized on the input data (quark correlators). The neural network weights are regarded as the emergent spacetime metric, and other physical observables predicted from the trained network geometry compare well with supercomputer simulations of QCD.
A picture of the energy landscape of deep neural networks
Speaker: Pratik Chaudhari
Deep networks are mysterious. These over-parametrized machine learning models, trained with rudimentary optimization algorithms on non-convex landscapes in millions of dimensions, have defied attempts to put a sound theoretical footing beneath their impressive performance.
This talk will shed light upon some of these mysteries. I will employ diverse ideas—from thermodynamics and optimal transportation to partial differential equations, control theory and Bayesian inference—and paint a picture of the training process of deep networks. Along the way, I will develop state-of-the-art algorithms for non-convex optimization.
The goal of machine perception is not just to classify objects in images but to enable intelligent agents that can seamlessly interact with our physical world. I will conclude with a vision of how advances in machine learning and robotics may come together to help build such an Embodied Intelligence.
Combinatorial Cosmology
Speaker: Liam McAllister
A foundational problem in string theory is to derive statistical predictions for observable phenomena in cosmology and in particle physics. I will give an accessible overview of this subject, assuming no prior knowledge of string theory.
I will explain that the set of possible natural laws is contained in the set of solutions of string theory that have no fields with zero mass and zero spin. Each solution is determined by a finite number of integers that specify the topology of the six-manifold on which the six extra dimensions of string theory are compactified. The total number of solutions is plausibly finite, albeit large. This set of solutions, the ‘landscape of string theory’, is the main object of study when relating string theory to observations.
I will discuss how the work of deriving predictions from the string landscape can be formulated as a computational problem. The integers specifying the six-manifold topology are the fundamental parameters in nature, and the task is to find out which values they can take, and what observables result for each choice. I will illustrate this problem in the case of six-manifolds that are defined by triangulations of certain four-dimensional lattice polytopes. Such manifolds are finite in number, though the number may exceed 10^900, and the computational tasks are almost entirely combinatorial. In this realm, one can aim to use machine learning to find patterns, or to pick out solutions with desirable properties. I will suggest some problems of this sort.