Professor of Computer Science

the Hebrew University of Jerusalem

Expressiveness in Deep Learning via Tensor Networks and Quantum Entanglement

Understanding deep learning calls for addressing three fundamental questions: expressiveness, optimization and generalization. This talk will describe a series of works aimed at unraveling some of the mysteries behind expressiveness. I will begin by showing that state-of-the-art deep learning architectures, such as convolutional networks, can be represented as tensor networks, a prominent computational model for quantum many-body simulations. This connection will inspire the use of quantum entanglement for defining measures of data dependencies modeled by deep networks. Next, I will derive a quantum max-flow / min-cut theorem characterizing the entanglement captured by deep networks. The theorem will give rise to new results that shed light on expressiveness in deep learning and, in addition, provide new tools for deep network design.

Works covered in the talk were in collaboration with Yoav Levine, Or Sharir, Ronen Tamari, David Yakira and Nadav Cohen.

Professor Amnon Shashua is senior vice president at Intel Corporation and president and chief executive officer of Mobileye, an Intel company. He leads Intel’s global organization focused on advanced driving assist systems (ADAS), highly autonomous and fully autonomous driving solutions and programs.

Shashua joined Intel in 2017 with Intel’s acquisition of Mobileye N.V., where he served as co-founder, CTO and chairman. Mobileye was launched in 1999 with the belief that vision-safety technology will make roads safer, reduce traffic congestion and save lives. Today, Mobileye, an Intel company, is a leading supplier of system-on-chip solutions with computer vision and machine learning software that enable ADAS for collision avoidance. Since its initial public offering in 2014, Mobileye has also positioned its technology for the development of autonomous driving with novel technologies in the area of high-definition mapping through crowdsourcing, while leveraging ADAS cameras and artificial intelligence technologies for driving policy that enable robotic cars to negotiate through dense traffic.

Prof. Shashua holds the Sachs Chair in computer science at the Hebrew University of Jerusalem. His field of expertise is computer vision and machine learning with emphasis on theoretical studies of deep networks. Prof. Shashua has received many awards for his research over the years, has published more than 120 scientific papers, and continues to be an active academic researcher.

Bren Professor at Caltech

director of machine learning research at NVIDIA

(live presentation)


Anima Anandkumar is a Bren Professor at Caltech and Director of ML Research at NVIDIA. She was previously a Principal Scientist at Amazon Web Services. She has received several honors, such as the Alfred P. Sloan Fellowship, the NSF CAREER Award, Young Investigator Awards from the DoD, and Faculty Fellowships from Microsoft, Google, Facebook, and Adobe. She is part of the World Economic Forum's Expert Network. She is passionate about designing principled AI algorithms and applying them in interdisciplinary applications. Her research focus is on unsupervised AI, optimization, and tensor methods.

Director of the Network Intelligence

and Distributed Systems Research Group

(Nokia-Bell Labs)

High Performance Computation for Tensor Networks Learning

In this talk, we study high-performance computation for tensor networks to address time and space complexities that grow rapidly with the tensor size. We propose primitives that exploit parallelism in tensor learning for efficient implementation on GPUs.

Anwar Walid is Director of Network Intelligence and Distributed Systems Research, and a Distinguished Member of Research at Bell Labs (Murray Hill, N.J.). He also served at Bell Labs as Head of the Mathematics of Systems Research Department, and Director of Global University Research Partnerships. He received his Ph.D. from Columbia University, and his B.S. and M.S. from New York University. He holds over 20 granted US and international patents on various aspects of computing, communications and networking. He has received awards from the ACM and IEEE, including the 2019 ACM SIGCOMM Networking Systems Award for “development of a networking system that has had a significant impact on the world of computer networking” and the 2017 IEEE Communications Society William R. Bennett Prize. Dr. Walid has served on various editorial boards including IEEE IoT Journal – 2019 Special Issue on AI-Enabled Cognitive Communications and Networking for IoT, and IEEE Transactions on Network Science. He served as General Co-Chair of the 2018 IEEE/ACM Conference on Connected Health (CHASE). He is an adjunct Professor at Columbia University’s Electrical Engineering Department, a Fellow of the IEEE and an elected member of the IFIP (International Federation for Information Processing) Working Group 7.3. https://www.bell-labs.com/usr/anwar.walid

Associate Professor

Laboratory of Computational Intelligence

Skoltech

Quantum in ML and ML in Quantum

In this talk, I will cover recent results in two areas: 1) Using quantum-inspired methods in machine learning, including using low-entanglement states (matrix product states/tensor train decompositions) for different regression and classification tasks. 2) Using machine learning methods for efficient classical simulation of quantum systems. I will cover our results on simulating quantum circuits on parallel computers using graph-based algorithms, and also efficient numerical methods for optimization using tensor trains for the computation of a large number (up to B=100) on GPUs. The code is a combination of classical linear algebra algorithms, Riemannian optimization methods and an efficient software implementation in TensorFlow.

1. Rakhuba, M., Novikov, A. and Oseledets, I., 2019. Low-rank Riemannian eigensolver for high-dimensional Hamiltonians. Journal of Computational Physics, 396, pp.718-737.

2. Schutski, R., Lykov, D. and Oseledets, I., 2020. Adaptive algorithm for quantum circuit simulation. Physical Review A, 101(4), p.042335.

3. Khakhulin, T., Schutski, R. and Oseledets, I., 2019. Graph Convolutional Policy for Solving Tree Decomposition via Reinforcement Learning Heuristics. arXiv preprint arXiv:1910.08371.

Ivan Oseledets has been working at Skoltech since August 2013. Prior to joining Skoltech, he worked at the Institute of Numerical Mathematics of the Russian Academy of Sciences. His main achievement is the introduction and development of different algorithms in the Tensor Train (TT) format. Such representations have been known for many years in physics (Matrix Product States, Tensor Networks, Transfer Matrices), but these results had gone unused in numerical mathematics. Now they are becoming an important tool in different applications in biology, chemistry and data mining.

assistant professor

at UC San Diego

Tensor Methods for Efficient and Interpretable Spatiotemporal Learning

Multivariate spatiotemporal data is ubiquitous in science and engineering, from climate science to sports analytics, to neuroscience. Such data contain higher-order correlations and can be represented as a tensor. Tensor latent factor models provide a powerful tool for reducing dimensionality and discovering higher-order structures. However, existing tensor models are often slow or fail to yield interpretable latent factors. In this talk, I will demonstrate advances in tensor methods to generate interpretable latent factors for high-dimensional spatiotemporal data. We provide theoretical guarantees and demonstrate their applications to real-world climate, basketball, and neuroscience data.

Dr. Rose Yu is an assistant professor at the University of California San Diego, Department of Computer Science and Engineering. She earned her Ph.D. in Computer Science at the University of Southern California in 2017. She was subsequently a Postdoctoral Fellow at the California Institute of Technology. She was an assistant professor at Northeastern University prior to her appointment at UCSD. Her research focuses on advancing machine learning techniques for large-scale spatiotemporal data analysis, with applications to sustainability, health, and physical sciences. A particular emphasis of her research is on physics-guided AI, which aims to integrate first principles with data-driven models. Among her awards, she has won a Google Faculty Research Award, an Adobe Data Science Research Award, an NSF CRII Award, and the Best Dissertation Award at USC, and was named one of the 'MIT Rising Stars in EECS'.

Assistant Professor

Canada CIFAR AI Chair

MILA, Université de Montréal, Canada

Tensor Network Models for Structured Data

In this talk, I will present uniform tensor network models (also known as translation-invariant tensor networks), which are particularly suited for modelling structured data such as sequences and trees. Uniform tensor networks are tensor networks where the core tensors appearing in the decomposition of a given tensor are all equal, which can be seen as a weight-sharing mechanism in tensor networks. In the first part of the talk, I will show how uniform tensor networks are particularly suited to represent functions defined over sets of structured objects such as sequences and trees. I will then present how these models are related to classical computational models such as hidden Markov models, weighted automata, second-order recurrent neural networks and context-free grammars. In the second part of the talk, I will present a classical learning algorithm for weighted automata and show how it can be interpreted as a means to convert non-uniform tensor networks to uniform ones. Lastly, I will present ongoing work leveraging the tensor network formalism to design efficient and versatile probabilistic models for sequence data.
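The weight-sharing idea can be illustrated with a minimal sketch in plain Python; it is not taken from the talk. It evaluates a function on sequences by reusing one core tensor at every position, which is how a weighted automaton computes f(x_1 ... x_n) = alpha^T A[x_1] ... A[x_n] omega. The names `ALPHA`, `CORE`, `OMEGA` and `evaluate`, and the toy automaton that counts occurrences of the symbol "a", are illustrative assumptions.

```python
def vec_mat(vector, matrix):
    """Row vector times matrix, using plain nested lists."""
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(len(matrix[0]))
    ]


def evaluate(alpha, core, omega, sequence):
    """Contract a uniform tensor network (weighted automaton) on a sequence.

    The same core is applied at every position: core[symbol] is the
    transition matrix selected by that symbol (the shared 3-leg tensor,
    indexed as core[symbol][left][right]).
    """
    state = alpha
    for symbol in sequence:
        state = vec_mat(state, core[symbol])
    return sum(s * w for s, w in zip(state, omega))


# Hypothetical toy model with bond dimension 2 over the alphabet {a, b}.
# It computes the number of occurrences of "a" in the input string.
ALPHA = [1, 0]
OMEGA = [0, 1]
CORE = {
    "a": [[1, 1], [0, 1]],  # shifts weight into the counting component
    "b": [[1, 0], [0, 1]],  # identity: "b" leaves the state unchanged
}

if __name__ == "__main__":
    for word in ["", "abba", "aaa"]:
        print(repr(word), evaluate(ALPHA, CORE, OMEGA, word))
```

Because the core is shared, the number of parameters does not depend on the sequence length, so one model can score inputs of any length.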

Guillaume Rabusseau is an assistant professor at Université de Montréal and holds a Canada CIFAR AI chair at the Mila research institute. Prior to joining Mila, he was an IVADO postdoctoral research fellow in the Reasoning and Learning Lab at McGill University, where he worked with Prakash Panangaden, Joelle Pineau and Doina Precup. He obtained his PhD in computer science in 2016 at Aix-Marseille University under the supervision of François Denis and Hachem Kadri. His research interests lie at the intersection of theoretical computer science and machine learning, and his work revolves around exploring inter-connections between tensors and machine learning to develop efficient learning methods for structured data relying on linear and multilinear algebra.

NVIDIA

cuTENSOR: High-Performance CUDA Tensor Primitives

This talk discusses cuTENSOR, a high-performance CUDA library for tensor operations that efficiently handles the ubiquitous presence of high-dimensional arrays (i.e., tensors) in today's HPC and DL workloads. This library supports highly efficient tensor operations such as tensor contractions, element-wise tensor operations such as tensor permutations, and tensor reductions. While providing high performance, cuTENSOR also enables users to express their mathematical equations for tensors in a straightforward way that hides the complexity of dealing with these high-dimensional objects behind an easy-to-use API.

Professor of Quantum Physics

at Freie Universität Berlin

Tensor Networks as a Data Structure in Probabilistic Modelling and for Learning Dynamical Laws from Data

Recent years have seen significant interest in exploiting tensor networks in the context of machine learning, both as a tool for the formulation of new learning algorithms and for enhancing the mathematical understanding of existing methods. In this talk, we will explore two readings of such a connection. On the one hand, we will consider the task of identifying the underlying non-linear governing equations of a dynamical system, which are required both for obtaining an understanding and for making future predictions. We will see that this problem can be addressed in a scalable way by making use of tensor-network-based parameterizations for the governing equations. On the other hand, we will investigate the expressive power of tensor networks in probabilistic modelling. Inspired by the connection of tensor networks and machine learning, and the natural correspondence between tensor networks and probabilistic graphical models, we will provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. Joint work with A. Goeßmann, M. Götte, I. Roth, R. Sweke, G. Kutyniok, I. Glasser, N. Pancotti, J. I. Cirac.

Jens Eisert is a full professor at the Free University of Berlin and holds an affiliation with the Helmholtz Center Berlin. He has made numerous contributions to quantum information science and the study of complex quantum systems in the context of condensed matter physics. Tensor networks have been one of the main tools used in these endeavours. He is passionate about ideas of quantum computing and simulation and is increasingly interested in notions of machine learning. He has received several awards for his work, including an ERC grant, a EURYI award, and the Google NISQ award, and is a member of the international advisory board of Zapata.

University of Ghent

Tensor networks and counting problems on the lattice

An overview will be given of counting problems on the lattice, such as the calculation of the hard square constant and of the residual entropy of ice. Unlike Monte Carlo techniques, which have difficulty in calculating such quantities, tensor networks provide a natural framework for tackling these problems, as we will demonstrate. We will also show that tensor networks reveal nonlocal hidden symmetries in those systems, and that the typical critical behaviour is witnessed by matrix product operators which form representations of tensor fusion categories.
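As a small illustration of such a counting problem (a sketch under stated assumptions, not the method of the talk), the number of hard-square configurations on a finite grid can be obtained by contracting row by row with a transfer matrix. A configuration places particles on grid sites so that no two occupied sites are horizontal or vertical neighbours. The function names `count_hard_squares` and `brute_force` are hypothetical; open boundary conditions are assumed.

```python
from itertools import product


def count_hard_squares(rows, cols):
    """Count hard-square configurations on a rows x cols grid (rows >= 1).

    Each row is a bitmask with no two adjacent occupied sites. Two rows may
    be stacked when they share no occupied column. Summing over all stacks
    of compatible rows is a contraction of a chain of transfer matrices.
    """
    states = [s for s in range(1 << cols) if s & (s >> 1) == 0]
    counts = {s: 1 for s in states}  # boundary vector for the first row
    for _ in range(rows - 1):
        # One application of the transfer matrix T[a][b] = 1 if a & b == 0.
        counts = {
            b: sum(c for a, c in counts.items() if a & b == 0)
            for b in states
        }
    return sum(counts.values())


def brute_force(rows, cols):
    """Enumerate all 2**(rows*cols) occupations; only for tiny grids."""
    total = 0
    for bits in product((0, 1), repeat=rows * cols):
        valid = True
        for r in range(rows):
            for c in range(cols):
                if not bits[r * cols + c]:
                    continue
                right = c + 1 < cols and bits[r * cols + c + 1]
                below = r + 1 < rows and bits[(r + 1) * cols + c]
                if right or below:
                    valid = False
        total += valid
    return total


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_hard_squares(n, n))
```

The row-by-row contraction costs time polynomial in the number of valid row states, whereas the brute-force enumeration doubles with every added site. The hard square constant is the growth rate of this count per site as the grid becomes large.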

Frank Verstraete is a professor at the University of Ghent. He is one of the founders of the field of quantum tensor networks. He pioneered the use of quantum entanglement as a unifying theme for describing strongly interacting quantum many-body systems, which are the most challenging systems to describe but also the most promising for future quantum technologies such as quantum computers. He has received numerous awards, and is also a distinguished visiting research chair at the Perimeter Institute for Theoretical Physics in Waterloo, Canada.

Quantum Research Scientist

AWS Center for Quantum Computing

Learning Quantum Channels with Tensor Networks

We present a new approach to quantum process tomography, the reconstruction of an unknown quantum channel from measurement data. Specifically, we combine a tensor-network representation of the Choi matrix (a complete description of a quantum channel), with unsupervised machine learning of single-shot projective measurement data. We show numerical experiments for both unitary and noisy quantum circuits, for a number of qubits well beyond the reach of standard process tomography techniques.

Giacomo is a research scientist at the AWS Center for Quantum Computing in Pasadena. Before that, he was a research fellow at the Center for Computational Quantum Physics of the Flatiron Institute in New York. His research focuses on the development of computational methods based on machine learning to investigate quantum many-body physics, ranging from strongly-correlated quantum matter to emerging quantum technologies.

Assistant Professor

at Georgia Institute of Technology

Getting Started with Tensor Networks

I will provide an overview of the tensor network formalism and its applications, and discuss the key operations, such as tensor contractions, required for building tensor network algorithms. I will also demonstrate the TensorTrace graphical interface, a software tool which is designed to allow users to implement and code tensor network routines easily and effectively. Finally, the utility of tensor networks towards tasks in machine learning will be briefly discussed.
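For readers new to the topic, the contraction operation mentioned above can be sketched in a few lines of plain Python. This is an illustrative example only and is unrelated to TensorTrace; the function name `contract` and the sample tensors are assumptions. It joins the last leg of a 3-leg tensor to the first leg of a matrix by summing over the shared index.

```python
def contract(a, b):
    """Contract the last leg of a 3-leg tensor `a` with the first leg of `b`.

    Tensors are nested lists. The result c is a 3-leg tensor with
    c[i][j][l] = sum over k of a[i][j][k] * b[k][l].
    """
    shared = len(b)      # dimension of the contracted index k
    n_out = len(b[0])    # dimension of the new open index l
    return [
        [
            [sum(fiber[k] * b[k][l] for k in range(shared)) for l in range(n_out)]
            for fiber in matrix
        ]
        for matrix in a
    ]


# Sample data: `A` has shape 2 x 2 x 3 and `B` has shape 3 x 2.
A = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
B = [
    [1, 0],
    [0, 1],
    [1, 1],
]

if __name__ == "__main__":
    C = contract(A, B)  # shape 2 x 2 x 2
    print(C)
```

Array libraries express the same operation more compactly, but the explicit sum shows what a single contraction does: one shared index disappears and the remaining open legs form the result.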

Glen Evenbly is currently an Assistant Professor at Georgia Institute of Technology. His research is focused on the development and implementation of tensor network approaches for the efficient simulation of quantum many-body systems. In particular, he has made significant contributions to the development of the multi-scale entanglement renormalization ansatz (MERA) and its application to the study of many-body systems at criticality, and is also the developer of the TensorTrace graphical interface for creating tensor network algorithms and code. Glen received the IUPAP Young Scientist Prize in Computational Physics in 2017 for his work on developing a new class of tensor network renormalization algorithms.

independent research and software consultant

at X, the moonshot factory

TensorNetwork: A Python Package for Tensor Network Computations

TensorNetwork is an open-source Python package for tensor network computations. It has been designed with the goal of helping researchers and engineers rapidly develop highly efficient tensor network algorithms for physics and machine learning applications. After a brief introduction to tensor networks, I will discuss some of the main design principles of the TensorNetwork package, and show how one can use it to speed up tensor network algorithms by running them on accelerated hardware, or by exploiting tensor sparsity.

Martin Ganahl is currently working as an independent research and software consultant at X, the moonshot factory. Prior to that, he was a postdoctoral fellow at the Perimeter Institute for Theoretical Physics. His research interests are tensor network algorithms and their applications to condensed matter physics, quantum field theory, material science and machine learning. He is also the co-creator of TensorNetwork, a Python library for tensor network simulations on accelerated hardware.

Faculty of Science

Center for Mathematical and Data Sciences

Kobe University

A century of the tensor network formulation from the Ising model

A hundred years have passed since the Ising model was proposed by Lenz in 1920. One finds that the square lattice Ising model is already an example of a two-dimensional tensor network (TN), which is formed by contracting 4-leg tensors. In 1941, Kramers and Wannier assumed a variational state in the form of the matrix product state (MPS), and they optimized it 'numerically'. Baxter reached the concept of the corner-transfer matrix (CTM), and performed a variational computation in 1968. Independently from these statistical studies, MPS was introduced by Affleck, Kennedy, Lieb and Tasaki (AKLT) in 1987 for the study of one-dimensional quantum spin chains, by Derrida for asymmetric exclusion processes, and also (implicitly) by the establishment of the density matrix renormalization group (DMRG) by White in 1992. After a brief (?) introduction of these prehistories, I'll speak about my contribution to this area, the applications of DMRG and CTMRG methods to two-dimensional statistical models, including those on hyperbolic lattices, fractal systems, and random spin models. Analysis of the spin-glass state, which is related to learning processes, from the viewpoint of the entanglement structure would be a target of future studies in this direction.

Associate professor at the Department of Physics, Faculty of Science, Kobe University, and the Center for Mathematical and Data Sciences, Kobe University. I'm interested in the thermal properties of statistical mechanical systems. For this purpose, I have been performing numerical analyses based on matrix and tensor product formulations, which are important backgrounds of modern tensor network studies. I encountered Baxter's textbook and also DMRG during the early stage of my research life. It should be noted that anyone is naturally guided to CTMRG from these two starting points.