Asst. Professor
School of Computer Science
Tel Aviv University
Implicit Regularization in Quantum Tensor Networks
The mysterious ability of neural networks to generalize is believed to stem from an implicit regularization, a tendency of gradient-based optimization to fit training data with predictors of low “complexity.” Despite vast efforts, a satisfying formalization of this intuition is lacking. In this talk I will present a series of works theoretically analyzing the implicit regularization in quantum tensor networks, known to be equivalent to certain (non-linear) neural networks. Through dynamical characterizations, I will establish an implicit regularization towards low tensor ranks, different from any type of norm minimization, in contrast to prior beliefs. I will then discuss implications of this finding to both theory (potential explanation for generalization over natural data) and practice (compression of neural network layers, novel regularization schemes). An underlying theme of the talk will be the potential of quantum tensor networks to unravel mysteries behind deep learning. Works covered in the talk were in collaboration with Sanjeev Arora, Wei Hu, Yuping Luo, Asaf Maman and Noam Razin.
Nadav Cohen is an Asst. Professor of Computer Science at Tel Aviv University. His research focuses on the theoretical and algorithmic foundations of deep learning. In particular, he is interested in mathematically analyzing aspects of expressiveness, optimization and generalization, with the goal of deriving theoretically founded procedures and algorithms that will improve practical performance. Nadav earned a BSc in electrical engineering and a BSc in mathematics (both summa cum laude) in the Technion Excellence Program for Distinguished Undergraduates. He obtained his PhD (direct track) at the School of Computer Science and Engineering of the Hebrew University of Jerusalem, after which he was a postdoctoral research scholar at the School of Mathematics of the Institute for Advanced Study in Princeton. For his contributions to the theoretical understanding of deep learning, Nadav has received a number of awards, including the Google Doctoral Fellowship in Machine Learning, the Rothschild Postdoctoral Fellowship, the Zuckerman Postdoctoral Fellowship, and the Google Research Scholar Award.
Bren Professor at Caltech
Director of Machine Learning Research at NVIDIA
(live presentation)
Anima Anandkumar is a Bren Professor at Caltech and Director of ML Research at NVIDIA. She was previously a Principal Scientist at Amazon Web Services. She has received several honors, including the Alfred P. Sloan Fellowship, the NSF CAREER Award, Young Investigator Awards from the DoD, and Faculty Fellowships from Microsoft, Google, Facebook, and Adobe. She is part of the World Economic Forum's Expert Network. She is passionate about designing principled AI algorithms and applying them in interdisciplinary applications. Her research focus is on unsupervised AI, optimization, and tensor methods.
Director of the Network Intelligence
and Distributed Systems Research Group
(Nokia Bell Labs)
High Performance Computation for Tensor Networks Learning
In this talk, we study high-performance computation for tensor networks to address time and space complexities that grow rapidly with tensor size. We propose efficient primitives that exploit parallelism in tensor learning for efficient implementation on GPUs.
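As a hedged illustration of the kind of primitive involved (not the specific primitives proposed in the talk), the sketch below contracts a data tensor with a factor matrix along one mode, a common building block of tensor learning; the axis names and sizes are made up, and on a GPU the identical einsum call can be issued through CuPy.

```python
import numpy as np
# import cupy as cp  # optional: cp.einsum runs the same contraction on a GPU

# A representative tensor-learning primitive: contracting a data tensor with a
# factor matrix along one mode (a mode product), here along the third mode.
X = np.random.rand(64, 32, 100)     # e.g. samples x features x time
U = np.random.rand(100, 10)         # factor matrix for the time mode
Y = np.einsum('ijt,tr->ijr', X, U)  # result has shape (64, 32, 10)

# On a GPU: Y = cp.einsum('ijt,tr->ijr', cp.asarray(X), cp.asarray(U))
```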
Anwar Walid is Director of Network Intelligence and Distributed Systems Research, and a Distinguished Member of Research at Bell Labs (Murray Hill, N.J.). He also served at Bell Labs as Head of the Mathematics of Systems Research Department, and Director of Global University Research Partnerships. He received Ph.D. from Columbia University, and B.S. and M.S. from New York University. He has over 20 US and international granted patents on various aspects of computing, communications and networking. He received awards from the ACM and IEEE, including the 2019 ACM SIGCOMM Networking Systems Award for “development of a networking system that has had a significant impact on the world of computer networking” and the 2017 IEEE Communications Society William R. Bennett Prize. Dr. Walid has served on various editorial boards including IEEE IoT Journal – 2019 Special Issue on AI-Enabled Cognitive Communications and Networking for IoT, and IEEE Transactions on Network Science. He served as General Co-Chair of 2018 IEEE/ACM Conference on Connected Health (CHASE). He is an adjunct Professor at Columbia University’s Electrical Engineering Department, a Fellow of the IEEE and an elected member of the IFIP (International Federation for Information Processing) Working Group 7.3. https://www.bell-labs.com/usr/anwar.walid
Associate Professor
Laboratory of Computational Intelligence
Skoltech
Quantum in ML and ML in Quantum
In this talk, I will cover recent results in two areas: 1) using quantum-inspired methods in machine learning, including the use of low-entanglement states (matrix product states / tensor train decompositions) for different regression and classification tasks; 2) using machine learning methods for efficient classical simulation of quantum systems. I will cover our results on simulating quantum circuits on parallel computers using graph-based algorithms, as well as efficient numerical methods for optimization with tensor trains at large scale (up to B = 100) on GPUs. The code combines classical linear-algebra algorithms, Riemannian optimization methods, and an efficient software implementation in TensorFlow.
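As a minimal, illustrative sketch of the tensor-train (matrix product state) representation mentioned above, and not of the speaker's own codes, the following decomposes a dense tensor into TT cores by successive truncated SVDs (the classical TT-SVD construction):

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a d-way tensor into tensor-train (MPS) cores via successive SVDs."""
    dims = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(S))
        cores.append(U[:, :new_rank].reshape(rank, dims[k], new_rank))
        mat = (S[:new_rank, None] * Vt[:new_rank]).reshape(new_rank * dims[k + 1], -1)
        rank = new_rank
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

# Contract the train back together and report the relative approximation error.
X = np.random.rand(4, 5, 6, 7)
cores = tt_svd(X, max_rank=8)
full = cores[0]
for core in cores[1:]:
    full = np.tensordot(full, core, axes=([-1], [0]))
print(np.linalg.norm(full.reshape(X.shape) - X) / np.linalg.norm(X))
```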
Ivan Oseledets has been working at Skoltech since August 2013. Prior to joining Skoltech, he worked at the Institute of Numerical Mathematics of the Russian Academy of Sciences. His main achievement is the introduction and development of different algorithms in the Tensor Train (TT) format. Such representations have been known for many years in physics (matrix product states, tensor networks, transfer matrices), but these results had gone largely unused in numerical mathematics. They are now becoming an important tool in different applications in biology, chemistry and data mining.
Assistant Professor
at UC San Diego
Tensor Methods for Efficient and Interpretable Spatiotemporal Learning
Multivariate spatiotemporal data is ubiquitous in science and engineering, from climate science to sports analytics, to neuroscience. Such data contain higher-order correlations and can be represented as a tensor. Tensor latent factor models provide a powerful tool for reducing dimensionality and discovering higher-order structures. However, existing tensor models are often slow or fail to yield interpretable latent factors. In this talk, I will demonstrate advances in tensor methods to generate interpretable latent factors for high-dimensional spatiotemporal data. We provide theoretical guarantees and demonstrate their applications to real-world climate, basketball, and neuroscience data.
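As a generic illustration of a tensor latent factor model (not the interpretable models presented in the talk), the sketch below computes a rank-R CP decomposition of a 3-way tensor by alternating least squares; the hypothetical axes locations x time steps x variables are chosen only for the example.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), giving (I*J) x R."""
    R = A.shape[1]
    return (A[:, None, :] * B[None, :, :]).reshape(-1, R)

def cp_als(X, rank, n_iter=100, seed=0):
    """Rank-R CP decomposition of a 3-way tensor via alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    X0 = X.reshape(I, J * K)                     # mode-0 unfolding
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # mode-1 unfolding
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # mode-2 unfolding
    for _ in range(n_iter):
        A = X0 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X1 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X2 @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C  # one latent factor matrix per mode

# Hypothetical spatiotemporal tensor: locations x time steps x variables.
X = np.random.rand(30, 50, 4)
A, B, C = cp_als(X, rank=3)
```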
Dr. Rose Yu is an assistant professor at the University of California San Diego, Department of Computer Science and Engineering. She earned her Ph.D. in Computer Science at the University of Southern California in 2017. She was subsequently a Postdoctoral Fellow at the California Institute of Technology. She was an assistant professor at Northeastern University prior to her appointment at UCSD. Her research focuses on advancing machine learning techniques for large-scale spatiotemporal data analysis, with applications to sustainability, health, and physical sciences. A particular emphasis of her research is on physics-guided AI, which aims to integrate first principles with data-driven models. Among her awards, she has won the Google Faculty Research Award, the Adobe Data Science Research Award, the NSF CRII Award, and the Best Dissertation Award at USC, and was named one of the 'MIT Rising Stars in EECS'.
Assistant Professor
Canada CIFAR AI Chair
MILA, Université de Montréal, Canada
Tensor Network Models for Structured Data
In this talk, I will present uniform tensor network models (also known as translation-invariant tensor networks), which are particularly suited to modelling structured data such as sequences and trees. Uniform tensor networks are tensor networks in which the core tensors appearing in the decomposition of a given tensor are all equal, which can be seen as a weight-sharing mechanism. In the first part of the talk, I will show how uniform tensor networks are particularly suited to representing functions defined over sets of structured objects such as sequences and trees. I will then present how these models are related to classical computational models such as hidden Markov models, weighted automata, second-order recurrent neural networks and context-free grammars. In the second part of the talk, I will present a classical learning algorithm for weighted automata and show how it can be interpreted as a means of converting non-uniform tensor networks into uniform ones. Lastly, I will present ongoing work leveraging the tensor network formalism to design efficient and versatile probabilistic models for sequence data.
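To make the weight-sharing picture concrete, here is a minimal sketch (with made-up sizes) of a weighted finite automaton evaluated on a sequence, which is exactly the contraction of a uniform MPS whose core tensor is shared across positions:

```python
import numpy as np

def wfa_score(alpha, A, omega, sequence):
    """f(x1..xn) = alpha^T A[x1] ... A[xn] omega: contraction of a uniform MPS
    whose core (the transition tensor A) is shared across all positions."""
    state = alpha
    for symbol in sequence:
        state = state @ A[symbol]
    return state @ omega

# Toy 2-state automaton over the alphabet {0, 1}.
rng = np.random.default_rng(0)
alpha, omega = rng.random(2), rng.random(2)
A = {0: rng.random((2, 2)), 1: rng.random((2, 2))}
print(wfa_score(alpha, A, omega, [0, 1, 1, 0]))
```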
Guillaume Rabusseau is an assistant professor at Université de Montréal and holds a Canada CIFAR AI chair at the Mila research institute. Prior to joining Mila, he was an IVADO postdoctoral research fellow in the Reasoning and Learning Lab at McGill University, where he worked with Prakash Panangaden, Joelle Pineau and Doina Precup. He obtained his PhD in computer science in 2016 at Aix-Marseille University under the supervision of François Denis and Hachem Kadri. His research interests lie at the intersection of theoretical computer science and machine learning, and his work revolves around exploring interconnections between tensors and machine learning to develop efficient learning methods for structured data relying on linear and multilinear algebra.
NVIDIA
cuTENSOR: High-Performance CUDA Tensor Primitives
This talk discusses cuTENSOR, a high-performance CUDA library for tensor operations that efficiently handles the ubiquitous presence of high-dimensional arrays (i.e., tensors) in today's HPC and DL workloads. This library supports highly efficient tensor operations such as tensor contractions, element-wise tensor operations such as tensor permutations, and tensor reductions. While providing high performance, cuTENSOR also enables users to express their mathematical equations for tensors in a straightforward way that hides the complexity of dealing with these high-dimensional objects behind an easy-to-use API.
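For readers unfamiliar with the index notation such equations use, the sketch below expresses a representative contraction with NumPy's einsum; this only illustrates the mathematical operation, not the cuTENSOR C API itself.

```python
import numpy as np

# C[m,u,n,v] = sum over h and k of A[m,h,k,n] * B[u,k,v,h]
A = np.random.rand(8, 4, 5, 8)   # modes m, h, k, n
B = np.random.rand(8, 5, 8, 4)   # modes u, k, v, h
C = np.einsum('mhkn,ukvh->munv', A, B)
print(C.shape)                   # (8, 8, 8, 8)
```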
Professor of Quantum Physics
at Freie Universität Berlin
Tensor Networks as a Data Structure in Probabilistic Modelling and for Learning Dynamical Laws from Data
Recent years have seen significant interest in exploiting tensor networks in the context of machine learning, both as a tool for formulating new learning algorithms and for enhancing the mathematical understanding of existing methods. In this talk, we will explore two readings of this connection. On the one hand, we will consider the task of identifying the underlying non-linear governing equations of a dynamical system from data, required both for obtaining an understanding and for making future predictions. We will see that this problem can be addressed in a scalable way by making use of tensor-network-based parameterizations of the governing equations. On the other hand, we will investigate the expressive power of tensor networks in probabilistic modelling. Inspired by the connection between tensor networks and machine learning, and by the natural correspondence between tensor networks and probabilistic graphical models, we will provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. Joint work with A. Goeßmann, M. Götte, I. Roth, R. Sweke, G. Kutyniok, I. Glasser, N. Pancotti and J. I. Cirac.
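As a toy illustration of a tensor-network factorization of a discrete distribution (a "Born machine" reading of an MPS), the sketch below assigns each bit string a probability proportional to the squared contraction of a random MPS; the brute-force normalization is only there to check that a valid distribution results, and all sizes are made up.

```python
import numpy as np
from itertools import product

def mps_amplitude(cores, bits):
    """Contract an MPS along one configuration; each core has shape (left, 2, right)."""
    vec = np.ones(1)
    for core, b in zip(cores, bits):
        vec = vec @ core[:, b, :]
    return vec.item()

rng = np.random.default_rng(1)
n, r = 6, 3
cores = [rng.standard_normal((1 if i == 0 else r, 2, 1 if i == n - 1 else r))
         for i in range(n)]

# Born-machine reading: p(x) proportional to |amplitude(x)|^2.
weights = {x: mps_amplitude(cores, x) ** 2 for x in product((0, 1), repeat=n)}
Z = sum(weights.values())
print(sum(w / Z for w in weights.values()))   # 1.0: a valid distribution over 2^n states
```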
Jens Eisert is a full professor at the Free University of Berlin and holds an affiliation with the Helmholtz Center Berlin. He has made numerous contributions to quantum information science and to the study of complex quantum systems in the context of condensed matter physics, with tensor networks being one of the main tools used in these endeavours. He is passionate about ideas in quantum computing and simulation and is increasingly interested in machine learning. He has received several awards for his work, including an ERC grant, an EURYI award and the Google NISQ award, and is a member of the international advisory board of Zapata.
University of Ghent
Tensor networks and counting problems on the lattice
An overview will be given of counting problems on the lattice, such as the calculation of the hard square constant and of the residual entropy of ice. We will demonstrate that, unlike Monte Carlo techniques, which have difficulty calculating such quantities, tensor networks provide a natural framework for tackling these problems. We will also show that tensor networks reveal nonlocal hidden symmetries in these systems, and that the typical critical behaviour is witnessed by matrix product operators which form representations of tensor fusion categories.
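As a small self-contained illustration of the flavour of such counting problems (using a plain transfer matrix rather than the tensor-network machinery of the talk), the per-site growth constant of hard squares can be estimated from the leading eigenvalue of a row-to-row transfer matrix:

```python
import numpy as np
from itertools import product

def hard_square_growth(width):
    """Per-site growth rate of hard-square configurations on a strip of given width,
    from the leading eigenvalue of the row-to-row transfer matrix."""
    # Rows with no two horizontally adjacent occupied sites.
    rows = [r for r in product((0, 1), repeat=width)
            if all(not (r[i] and r[i + 1]) for i in range(width - 1))]
    # T[a, b] = 1 if row b may sit on top of row a (no vertical adjacency).
    T = np.array([[float(all(not (ra[i] and rb[i]) for i in range(width)))
                   for rb in rows] for ra in rows])
    lam = np.linalg.eigvalsh(T).max()
    return lam ** (1.0 / width)

for w in (4, 6, 8, 10, 12):
    print(w, hard_square_growth(w))  # approaches the hard square constant ~1.5030 as width grows
```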
Frank Verstraete is Professor at the University of Ghent. He is one of the founders of the field of quantum tensor networks. He pioneered the use of quantum entanglement as a unifying theme for describing strongly interacting quantum many-body systems, which are the most challenging systems to describe but also the most promising for future quantum technologies such as quantum computers. He has received numerous awards, and is also a Distinguished Visiting Research Chair at the Perimeter Institute for Theoretical Physics in Waterloo, Canada.
Quantum Research Scientist
AWS Center for Quantum Computing
Learning Quantum Channels with Tensor Networks
We present a new approach to quantum process tomography, the reconstruction of an unknown quantum channel from measurement data. Specifically, we combine a tensor-network representation of the Choi matrix (a complete description of a quantum channel), with unsupervised machine learning of single-shot projective measurement data. We show numerical experiments for both unitary and noisy quantum circuits, for a number of qubits well beyond the reach of standard process tomography techniques.
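For context, the Choi matrix mentioned above is a standard object in quantum information; here is a minimal dense (not tensor-network) sketch of how it encodes a channel, using a single-qubit depolarizing channel with an arbitrary noise strength.

```python
import numpy as np

def choi_matrix(channel, dim):
    """Choi matrix sum_{ij} |i><j| (x) E(|i><j|): a complete description of channel E."""
    choi = np.zeros((dim * dim, dim * dim), dtype=complex)
    for i in range(dim):
        for j in range(dim):
            E_ij = np.zeros((dim, dim), dtype=complex)
            E_ij[i, j] = 1.0
            choi += np.kron(E_ij, channel(E_ij))
    return choi

# Single-qubit depolarizing channel E(rho) = (1 - p) rho + p tr(rho) I/2, with p chosen arbitrarily.
p = 0.1
depolarize = lambda rho: (1 - p) * rho + p * np.trace(rho) * np.eye(2) / 2
C = choi_matrix(depolarize, 2)
print(np.trace(C).real)   # equals dim = 2 for a trace-preserving channel
```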
Giacomo is a research scientist at the AWS Center for Quantum Computing in Pasadena. Before that, he was a research fellow at the Center for Computational Quantum Physics of the Flatiron Institute in New York. His research focuses on the development of computational methods based on machine learning to investigate quantum many-body physics, ranging from strongly-correlated quantum matter to emerging quantum technologies.
Assistant Professor
at Georgia Institute of Technology
Getting Started with Tensor Networks
I will provide an overview of the tensor network formalism and its applications, and discuss the key operations, such as tensor contractions, required for building tensor network algorithms. I will also demonstrate the TensorTrace graphical interface, a software tool designed to allow users to implement and code tensor network routines easily and effectively. Finally, the utility of tensor networks for tasks in machine learning will be briefly discussed.
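As a warm-up in the spirit of this overview (plain NumPy, not TensorTrace), here is a small three-tensor network contracted both pairwise and in a single einsum call:

```python
import numpy as np

# A small network: A[i,j], B[j,k,l], C[l,m]; the shared indices j and l are summed.
A = np.random.rand(3, 4)
B = np.random.rand(4, 5, 6)
C = np.random.rand(6, 2)

# Pairwise contractions, the basic operation tensor network algorithms are built from.
AB = np.tensordot(A, B, axes=([1], [0]))     # sum over j, result shape (3, 5, 6)
ABC = np.tensordot(AB, C, axes=([2], [0]))   # sum over l, result shape (3, 5, 2)

# The same network contracted in a single call.
assert np.allclose(ABC, np.einsum('ij,jkl,lm->ikm', A, B, C))
```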
Glen Evenbly is currently an Assistant Professor at Georgia Institute of Technology. His research is focused on the development and implementation of tensor network approaches for the efficient simulation of quantum many-body systems. In particular, he has made significant contributions to the development of the multi-scale entanglement renormalization ansatz (MERA) and its application to the study of many-body systems at criticality, and is also the developer of the TensorTrace graphical interface for creating tensor network algorithms and code. Glen received the IUPAP Young Scientist Prize in Computational Physics in 2017 for his work on developing a new class of tensor network renormalization algorithm.
Independent Research and Software Consultant
at X, the moonshot factory
TensorNetwork: A Python Package for Tensor Network Computations
TensorNetwork is an open-source Python package for tensor network computations. It has been designed to help researchers and engineers rapidly develop highly efficient tensor network algorithms for physics and machine learning applications. After a brief introduction to tensor networks, I will discuss some of the main design principles of the TensorNetwork package and show how one can use it to speed up tensor network algorithms by running them on accelerated hardware or by exploiting tensor sparsity.
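To give a flavour of the package, here is a minimal example along the lines of its README (contracting the single shared edge between two nodes); exact API details may differ across versions.

```python
import numpy as np
import tensornetwork as tn

a = tn.Node(np.ones(10))
b = tn.Node(np.ones(10))
edge = a[0] ^ b[0]          # connect the 0th leg of a to the 0th leg of b
result = tn.contract(edge)
print(result.tensor)        # 10.0, the inner product of the two vectors
```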
Martin Ganahl is currently working as an independent research and software consultant at X, the moonshot factory. Prior to that he was a postdoctoral fellow at the Perimeter Institute for Theoretical Physics. His research interests are tensor network algorithms and their applications to condensed matter physics, quantum field theory, materials science and machine learning. He is also the co-creator of TensorNetwork, a Python library for tensor network simulations on accelerated hardware.
Associate professor at Department of Physics
Faculty of Science
Center for Mathematical and Data Sciences
Kobe University
A century of the tensor network formulation from the Ising model
A hundred years have passed since the Ising model was proposed by Lenz in 1920. One finds that the square-lattice Ising model is already an example of a two-dimensional tensor network (TN), formed by contracting 4-leg tensors. In 1941, Kramers and Wannier assumed a variational state in the form of a matrix product state (MPS) and optimized it 'numerically'. Baxter reached the concept of the corner transfer matrix (CTM) and performed a variational computation in 1968. Independently from these statistical studies, the MPS was introduced by Affleck, Lieb, Kennedy and Tasaki (AKLT) in 1987 for the study of one-dimensional quantum spin chains, by Derrida for asymmetric exclusion processes, and also (implicitly) through the establishment of the density matrix renormalization group (DMRG) by White in 1992. After a brief (?) introduction to these prehistories, I'll speak about my contribution to this area: the applications of DMRG and CTMRG methods to two-dimensional statistical models, including those on hyperbolic lattices, fractal systems, and random spin models. Analysis of the spin-glass state, which is related to learning processes, from the viewpoint of entanglement structure would be a target of future studies in this direction.
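To make the opening remark concrete, here is a small sketch (plain NumPy, with an arbitrary inverse temperature) that builds the 4-leg tensor of the square-lattice Ising model, assembles a row-to-row transfer matrix for a width-3 ring, and checks the resulting torus partition function against brute-force summation:

```python
import numpy as np
from itertools import product

def ising_site_tensor(beta, J=1.0):
    """4-leg tensor whose contraction over the square lattice gives the Ising Z."""
    bond = np.array([[np.exp(beta * J), np.exp(-beta * J)],
                     [np.exp(-beta * J), np.exp(beta * J)]])   # exp(beta*J*s*s')
    w, v = np.linalg.eigh(bond)
    W = v @ np.diag(np.sqrt(w))                                 # bond = W @ W.T
    return np.einsum('si,sj,sk,sl->ijkl', W, W, W, W)

beta, L, M = 0.4, 3, 3
T = ising_site_tensor(beta)

# Row-to-row transfer matrix of a width-3 ring; Z on the L x M torus is Tr(row^M).
row = np.einsum('aibj,cjdk,ekfi->acebdf', T, T, T).reshape(8, 8)
Z_tn = np.trace(np.linalg.matrix_power(row, M))

# Brute-force check over all 2^(L*M) spin configurations.
Z_bf = sum(np.exp(beta * (np.sum(s * np.roll(s, 1, axis=0)) + np.sum(s * np.roll(s, 1, axis=1))))
           for s in (np.array(c).reshape(L, M) for c in product((1, -1), repeat=L * M)))
print(Z_tn, Z_bf)   # the two agree
```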
Associate professor at the Department of Physics, Faculty of Science, Kobe University, and the Center for Mathematical and Data Sciences, Kobe University. I am interested in the thermal properties of statistical mechanical systems. For this purpose, I have been performing numerical analyses based on matrix and tensor product formulations, which are important backgrounds of modern tensor network studies. I encountered Baxter's textbook and also DMRG during the early stage of my research life. It should be noted that anyone would naturally be guided to CTMRG from these two starting points.