Institutional Repository
Technical University of Crete
EN  |  EL



My Space

Efficient parallel algoriths for tensor decompositions and implementation in distributed environments

Kolomvakis Christos

Full record

Year 2019
Type of Item Diploma Work
Bibliographic Citation Christos Kolomvakis, "Efficient parallel algoriths for tensor decompositions and implementation in distributed environments ", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2019
Appears in Collections


Tensors are mathematical objects that have recently become popular in the fields of machine learning and signal processing. This is due to their ability to describe multi-way relations and dependencies. For a complete data set, where we have no missing entries, tensor decompositions can help us model the data set, while others give us the potential to perform compression. For a data set with missing entries, we can use decompositions to infer missing values. Tensors however, suffer from the curse of dimensionality. The number of elements increases exponentially with respect to the order of a tensor. As a result, it is necessary to develop fast and scalable algorithms to perform these decompositions.In this thesis, we focus on the CP decomposition, or PARAFAC decomposition. We begin this work by introducing some definitions from linear algebra, followed by matrix least squares problems. These chapters are succeded by an introduction to tensor algebra, and a presentation of the PARAFAC decomposition. The problem we deal with in this thesis is the speeding of a matrix multiplication during the decomposition algorithm. This product is known in the bibliography as the matricized--tensor Khatri-Rao product (MTTKRP) and constitutes the computational bottleneck of the algorithm. To this end, we use a data structure called a dimension tree. The tree stores intermediate results that can be reused in the decomposition algorithm, in order to speed up the MTTKRP.Finally, we describe a parallel implementation in a distributed environment of this algorithm, and present results for various tensors and a range of processors.

Available Files