- Generative Tensor Network Classification for Supervised Learning. [PDF] [Slide] [Video]
Ding Liu, Zheng-Zhi Sun, Cheng Peng, Gang Su and Shi-Ju Ran
Abstract: The tensor network (TN) is developing rapidly into a powerful machine learning (ML) model that is built upon quantum theories and methods. Here, we introduce the generative TN classifier (GTNC), which is demonstrated to possess unique advantages over other relevant and well-established ML models such as support vector machines and naive Bayes classifiers. Specifically, the GTNC is shown to rely much less on hyper-parameters, and to be an adaptive model that avoids over-fitting by limiting the parameter complexity according to the entanglement. GTNC paves new paths to quantum-inspired probabilistic ML models based on TN.
- Recent Advances on Robust Tensor Principal Component Analysis. [PDF] [Slide] [Video]
Lanlan Feng, Shenghan Wang, Ce Zhu and Yipeng Liu
Abstract: The task of robust tensor principal component analysis (RTPCA) is to separate the underlying low-rank component and sparse component in high-dimensional data. In RTPCA, an order-3 tensor X can be decomposed as X = L + E, where L and E represent a low-rank tensor and a sparse tensor, respectively. RTPCA makes good use of the multi-dimensional structure of the data and has found successful applications in background modeling, image denoising, illumination normalization for face images, etc.
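For orientation, the separation X = L + E is often posed as a convex program. The formulation below is one common choice, given here as an assumption for illustration; the abstract does not state which objectives the paper covers. Here the nuclear-norm symbol stands for some tensor nuclear norm acting as a surrogate for low rank, the l1 norm promotes sparsity, and λ > 0 is a hypothetical trade-off weight.

```latex
\min_{\mathcal{L},\,\mathcal{E}} \; \|\mathcal{L}\|_{*} + \lambda \,\|\mathcal{E}\|_{1}
\quad \text{subject to} \quad \mathcal{X} = \mathcal{L} + \mathcal{E}
```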
- Multiary Relational Knowledge Base Completion via Tensor Decomposition. [PDF]
Yu Liu, Quanming Yao and Yong Li
Abstract: As a generalization of the knowledge graph, the multiary relational knowledge base (KB), with binary and beyond-binary relational facts, is closer to real-world knowledge but still underexplored. In this short paper, we investigate the multiary relational knowledge base completion problem and propose a generalized model based on Tucker decomposition and Tensor Ring decomposition. Compared to the state-of-the-art methods, the proposed model obtains relative improvements of 7% and 9% on two benchmark multiary relational KB datasets, respectively.
- Trillion-Tensor: Trillion-Scale CP Tensor Decomposition. [PDF] [Slide] [Video]
Zeliang Zhang, Xiao-Yang Liu and Pan Zhou
Abstract: Due to storage limitations, tensors with billions or trillions of nonzero elements cannot be loaded into main memory. Existing tensor decomposition methods only support billion-scale tensors. In this paper, we implement a compression-based algorithm, trillion-tensor, for trillion-scale (number of elements) CP tensor decomposition by trading computation for storage. We make full use of the parallelism of tensor operations to accelerate the proposed trillion-tensor scheme on CPUs and GPUs. In our experiments, we test tensors ranging from million-scale to trillion-scale and obtain a relatively low mean squared error. Compared to the baseline method PARACOMP, trillion-tensor supports 8,000× larger tensors and achieves a speedup of up to 6.95×.
- Tensor Representation for Brain Signal Processing. [PDF] [Slide] [Video]
Zhe Sun, Zhiwen Zhang, Zihao Huang, Binghua Li, Feng Duan, Zipei Fan and Jordi Solé-Casals
Abstract: In the past decades, electroencephalography (EEG) has been widely used for brain-computer interfaces (BCI). To detect patterns in EEG signals, different kinds of algorithms have been developed. In our previous work, we showed that a tensor representation of EEG signals can improve EEG signal classification. In another study, we used a tensor fusion algorithm for a multi-modal BCI system. In that system, electroencephalography and near-infrared spectroscopy were recorded simultaneously, and a tensor-based fused feature vector was used for classification. The results showed better performance than previous studies.
- Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor Decomposition. [PDF] [Slide] [Video]
Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang and Bo Yuan
Abstract: Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling. However, when processing high-dimensional data, RNNs typically require huge model sizes, which brings a series of deployment challenges. Although state-of-the-art tensor decomposition approaches can provide good model compression performance, these existing methods still suffer from some inherent limitations, such as restricted representation capability and insufficient model complexity reduction. To overcome these limitations, we propose to develop compact RNN models using Hierarchical Tucker (HT) decomposition. HT decomposition brings a strong hierarchical structure to the decomposed RNN models, which is important for enhancing the representation capability. Meanwhile, HT decomposition provides higher storage and computational cost reduction than the existing tensor decomposition approaches for RNN compression. Our experimental results show that, compared with state-of-the-art compressed RNN models such as TT-LSTM, TR-LSTM and BT-LSTM, our proposed HT-based LSTM (HT-LSTM) consistently achieves simultaneous and significant increases in both compression ratio and test accuracy on different datasets.
- cuTensor-HT: High Performance Third-order Hierarchical Tucker Tensor Decomposition on GPUs. [PDF] [Slide] [Video]
Hao Huang, Tao Zhang and Xiao-Yang Liu
Abstract: Extracting effective information from large-scale multi-dimensional data has become a hot issue, for which the hierarchical Tucker (HT) tensor decomposition is a widely used tool. However, the HT tensor decomposition is a computationally intensive task since the time complexity increases rapidly with the dimension and size of the tensor. In this paper, we implement the HT decomposition on the GPU and propose optimization strategies to improve resource utilization, including efficient memory access, reduced computation, and batched operations. For tensors of various sizes, the optimized GPU implementation achieves 4.67× speedups over the unoptimized GPU baseline.
- Adaptive Regularizing Tucker Decomposition for Knowledge Graph Completion. [PDF] [Slide] [Video]
Quanming Yao, Shimin Di and Yongqi Zhang
Abstract: Tensor factorization approaches have recently become popular in knowledge graph completion (KGC). Among them, TuckER, which introduces Tucker decomposition in KGC, is the state-of-the-art method. However, due to its high model complexity, neither the effectiveness nor the efficiency of TuckER is satisfactory. In this paper, we improve TuckER with automated machine learning (AutoML) techniques. Specifically, we propose to regularize the over-parameterized core tensor in Tucker with a one-shot architecture search algorithm. The resulting new factorization method not only sparsifies the core tensor but also improves its interpretability. Finally, empirical results demonstrate that the proposed method achieves state-of-the-art performance on KGC.
- cuTensor-TT/TR: High Performance Third-order Tensor-Train and Tensor-Ring Decompositions on GPUs. [PDF] [Slide] [Video]
Hao Hong, Tao Zhang and Xiao-Yang Liu
Abstract: Tensor decompositions are widely used for processing multi-dimensional data in machine learning. However, the time and space costs of tensor decompositions increase rapidly with the size and dimension of tensors. In this paper, we propose high performance GPU implementations for the third-order tensor-train (TT) decomposition and tensor-ring (TR) decomposition. First, we propose to utilize a highly parallel Jacobi-based singular value decomposition (SVD) on the GPU. Second, we parallelize the diagonal matrix-matrix multiplication on the GPU. Third, we optimize the transfer and memory access of intermediate variables to further improve performance. On a Tesla V100 GPU, we tested tensors up to 1,200 × 1,200 × 1,200. Compared with the basic GPU implementations, the proposed GPU implementations of TT and TR decompositions achieve up to 6.67× and 6.36× speedups, respectively.
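As a rough illustration of where the SVD and the diagonal-matrix product appear in a third-order TT decomposition, here is a minimal CPU sketch of the standard TT-SVD procedure in NumPy. It is not the paper's GPU implementation: the function names `tt_svd_3d` and `tt_reconstruct`, the fixed target ranks `r1` and `r2`, and the use of `np.linalg.svd` instead of a Jacobi-based SVD are all assumptions made for this example.

```python
import numpy as np

def tt_svd_3d(x, r1, r2):
    """Truncated TT-SVD of a third-order tensor (illustrative sketch)."""
    n1, n2, n3 = x.shape
    # Step 1: SVD of the mode-1 unfolding (n1 x n2*n3).
    u, s, vt = np.linalg.svd(x.reshape(n1, n2 * n3), full_matrices=False)
    r1 = min(r1, s.size)
    g1 = u[:, :r1]                                   # first core, (n1, r1)
    # Diagonal matrix times matrix: diag(s) @ vt, done by row scaling.
    rest = (s[:r1, None] * vt[:r1]).reshape(r1 * n2, n3)
    # Step 2: SVD of the reshaped remainder (r1*n2 x n3).
    u, s, vt = np.linalg.svd(rest, full_matrices=False)
    r2 = min(r2, s.size)
    g2 = u[:, :r2].reshape(r1, n2, r2)               # middle core, (r1, n2, r2)
    g3 = s[:r2, None] * vt[:r2]                      # last core, (r2, n3)
    return g1, g2, g3

def tt_reconstruct(g1, g2, g3):
    """Contract the three TT cores back into a full tensor."""
    return np.einsum("ia,ajb,bk->ijk", g1, g2, g3)
```

When the input tensor has TT ranks no larger than `(r1, r2)`, contracting the returned cores with `tt_reconstruct` reproduces it up to floating-point error; with smaller ranks the result is a low-rank approximation.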
- cuTensor-CP: High Performance Third-order CP Tensor Decomposition on GPUs. [PDF] [Slide] [Video]
Xiao-Yang Liu, Han Lu and Tao Zhang
Abstract: Tensor decompositions that factorize multi-dimensional data into latent factors have become a powerful tool for big data analytics and machine learning, e.g., video processing, deep learning, social networks, etc. However, the time and space complexities of tensor decomposition algorithms grow rapidly with the size of tensors. Exploiting the parallelism of tensor algorithms and accelerating them on many-core GPUs is promising. In this paper, we develop an efficient CP tensor decomposition on GPUs by exploiting tensor algorithm parallelism. We implement and optimize key operations, including tensor matricization and the matricized tensor times Khatri-Rao product (MTTKRP). We optimize data transfer, reduce the memory footprint, and employ more efficient computations with lower computational complexity. Compared with the TensorLab library running on a Tesla V100 GPU, our implementation of CP decomposition achieves up to 5.56× speedup.
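The two key operations named above, tensor matricization and MTTKRP, can be sketched in a few lines of NumPy. This is a CPU illustration only, not the paper's GPU code; the function names `khatri_rao`, `mttkrp_mode1` and `als_update_mode1` are hypothetical, and the row-major (C-order) unfolding convention is an assumption of this example.

```python
import numpy as np

def khatri_rao(b, c):
    """Column-wise Kronecker product; row j*K + k equals b[j] * c[k]."""
    j, r = b.shape
    k, _ = c.shape
    return (b[:, None, :] * c[None, :, :]).reshape(j * k, r)

def mttkrp_mode1(x, b, c):
    """Mode-1 MTTKRP: unfold x to (I, J*K), then multiply by the Khatri-Rao product."""
    i, j, k = x.shape
    return x.reshape(i, j * k) @ khatri_rao(b, c)

def als_update_mode1(x, b, c):
    """One alternating-least-squares update of the mode-1 factor."""
    gram = (b.T @ b) * (c.T @ c)          # Hadamard product of Gram matrices
    m = mttkrp_mode1(x, b, c)
    # Solve A @ gram = m for A (gram is symmetric).
    return np.linalg.solve(gram, m.T).T
```

MTTKRP is typically the dominant cost in CP-ALS, which is why it is a natural target for optimization. For a tensor that is exactly of CP rank R with factors A, B, C (and an invertible Gram matrix), `als_update_mode1(x, B, C)` returns A up to floating-point error.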
- Tensor Network for Supervised Learning at Finite Temperature. [PDF] [Slide] [Video]
Haoxiang Lin, Shuqian Ye and Xi Zhu
Abstract: The large variation of datasets is a huge barrier for image classification tasks. In this paper, we embrace this observation and introduce the finite temperature tensor network (FTTN), which imports thermal perturbation into the matrix product states framework by placing all images in an environment with constant temperature, in analogy to energy-based learning. The tensor network is chosen since it is the best platform to introduce thermal fluctuation. Unlike traditional network structures, which directly take the sum of individual losses as the loss function, FTTN regards the loss as a thermal average computed from the entanglement with the environment. The temperature-like parameter can be automatically optimized, which gives each dataset an individual temperature. FTTN improves both test accuracy and convergence speed on several datasets. The non-zero temperature automatically separates similar features, avoiding misclassifications made by the previous architecture. Thermal fluctuation may yield further improvements in other frameworks, and the dataset temperature may also be used to improve training.
- An L1-L2 Variant of Tubal Nuclear Norm for Guaranteed Tensor Recovery. [PDF] [Slide] [Video]
Andong Wang, Guoxu Zhou, Zhong Jin and Qibin Zhao
Abstract: As a convex approximation of the tensor multi-rank which models low-rankness in the spectral domain, the Tubal Nuclear Norm (TNN) has shown superiority over traditional tensor nuclear norms in many tensor recovery tasks. However, it over-penalizes larger singular values of the Fourier block-diagonal matrix and may result in biased estimation. To address this, we define a non-convex l1 - α l2 metric to approximate the tensor multi-rank and introduce it into a new tensor sensing model with guaranteed recovery performance. The proximal operator of the metric is then derived and used in an alternating direction method of multipliers (ADMM)-based algorithm to solve the problem. The effectiveness of the proposed metric is evaluated on both synthetic and real data.
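To make the idea of an l1 - α l2 penalty on Fourier-domain singular values concrete, here is a small NumPy sketch. It rests on an assumption: the metric is taken to be the l1 norm minus α times the l2 norm of the vector of singular values of all frontal slices after an FFT along the third mode (equivalently, of the Fourier block-diagonal matrix), with no normalization constant. The paper's exact definition may differ, and the function names are hypothetical.

```python
import numpy as np

def fourier_singular_values(x):
    """Singular values of every frontal slice of the FFT of x along mode 3."""
    xf = np.fft.fft(x, axis=2)
    # Put the slice index first so that np.linalg.svd runs batched over slices.
    slices = np.moveaxis(xf, 2, 0)
    return np.linalg.svd(slices, compute_uv=False).ravel()

def l1_minus_alpha_l2(x, alpha=1.0):
    """Assumed form of the metric: ||s||_1 - alpha * ||s||_2 on Fourier singular values."""
    s = fourier_singular_values(x)
    return s.sum() - alpha * np.sqrt((s ** 2).sum())
```

With `alpha=0` this reduces to an unnormalized tubal nuclear norm. For `alpha > 0` the subtracted l2 term offsets part of the penalty, which is the general mechanism by which l1 - α l2 metrics reduce the over-penalization of large singular values.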
- Convolutional Graph-Tensor Net for Graph Data Completion. [PDF] [Slide] [Video]
Xiao-Yang Liu and Ming Zhu
Abstract: Graph data completion is a fundamentally important issue as data generally has a graph structure, e.g., social networks, recommendation systems, and the Internet of Things. We consider a graph where each node has a data matrix, represented as a graph-tensor by stacking the data matrices in the third dimension. In this paper, we propose a Convolutional Graph-Tensor Net (Conv GT-Net) for the graph data completion problem, which uses deep neural networks to learn the general transform of graph-tensors. The experimental results on the ego-Facebook data sets show that the proposed Conv GT-Net achieves significant improvements on both completion accuracy (50% higher) and completion speed (3.6× to 8.1× faster) over the existing algorithms.
- Tensor Decomposition via Core Tensor Networks. [PDF] [Slide] [Video]
Jianfu Zhang, Zerui Tao, Qibin Zhao and Liqing Zhang
Abstract: Tensor decomposition has shown promising performance in image completion and denoising. Existing methods aim to decompose one tensor into latent factors or core tensors by optimizing a particular cost function based on a specific tensor model. These algorithms iteratively learn the optima from random initialization for each individual tensor, resulting in slow convergence and low efficiency. In this paper, we propose an efficient tensor decomposition algorithm which aims to learn a global mapping from input tensors to latent core tensors, under the assumption that the mappings of a set of tensors might be shared or highly correlated. To this end, we train a deep neural network (DNN) to model the global mapping and then apply it to decompose a newly given tensor with high efficiency. Furthermore, the initial values of the DNN are learned with meta-learning methods. By leveraging the pretrained, meta-learning-based core tensor DNN, our proposed method performs tensor decomposition efficiently and accurately. Experimental results demonstrate significant improvements of our method over other tensor decomposition methods in terms of speed and accuracy.
- Bayesian Tensor Ring Decomposition for Low Rank Tensor Completion. [PDF] [Slide] [Video]
Zerui Tao and Qibin Zhao
Abstract: Recently, tensor networks (TNs) have become an attractive topic at the intersection of physics and machine learning. By factorizing a higher-order tensor into small tensors, TNs are able to capture complex multi-linear relations within the data. However, in most applications, the structures of the TNs are predefined, which greatly limits the flexibility of the models in diverse situations. In this paper, we aim to bring the power of Bayesian learning into tensor ring (TR) decomposition, which is one of the most popular TN structures. The advantages of the Bayesian TR model are twofold: 1) under the fully Bayesian framework, the estimator is probabilistic and robust; 2) with the help of sparse priors, the proposed model can automatically prune redundant factors and infer the underlying structures of the data. To approximate the posterior, we establish two inference algorithms: a Gibbs sampler and variational inference (VI). We also conduct experiments on simulated data and image inpainting tasks to show the effectiveness of the proposed model.
- Acceleration of Fractional Fourier Transforms via Tensor-train Decomposition. [PDF] [Slide] [Video]
Runjia Zhang and Chao Li
Abstract: The fractional Fourier transform (FrFT) is a generalization of the ordinary Fourier transform. In this extended abstract, we discuss a tensor-network-enabled method to accelerate the numerical calculation of the discrete FrFT, along with other numerical and optical realizations. We also discuss how the proposed FrFT approach extends to optics, signal processing and differential equations, such as Schrödinger equations.
- Tensor and Tensor Networks for Machine Learning: An Hourglass Architecture. [PDF] [Slide] [Video]
Xiao-Yang Liu, Qibin Zhao and Anwar Walid
Abstract: Tensors and tensor networks are envisioned to have great potential to advance machine learning technologies. Recent works show that tensor networks provide powerful simulations of quantum machine learning algorithms on classical computers. We observe that tensors and tensor networks in machine learning exhibit a layered architecture that resembles an hourglass. In this paper, we describe a seven-layer architecture to characterize the role of tensors and tensor networks in machine learning, point out current challenges and discuss recent innovations. As a cornerstone data structure, tensors and tensor networks lie at the waist of the hourglass-shaped architecture, while the lower and upper layers tend to see frequent innovations. We expect tensors and tensor networks to continue to serve as an amplifier for computational intelligence, a transformer for machine learning innovations, and a propeller for AI industrialization.