Parallel optimization algorithms for very large tensor decompositions
Παράλληλοι αλγόριθμοι βελτιστοποίησης για παραγοντοποιήσεις πολύ μεγάλων τανυστών
Διπλωματική Εργασία
Diploma Work
2019-10-04
2019
en
Tensors generalize matrices to higher dimensions and can model a wide variety of multi-way data dependencies. As a result, tensor decompositions can extract useful information from multi-aspect data and have become increasingly popular in fields such as data mining, social network analysis, biomedical applications, and machine learning. Many decompositions have been proposed; this thesis focuses on the tensor rank decomposition, also known as the Canonical Polyadic Decomposition (CPD), computed with Alternating Least Squares (ALS). The main goal of the CPD is to decompose a tensor into a sum of rank-1 terms, a procedure more difficult than its matrix counterpart, especially for large-scale tensors. CP decomposition via ALS consists of computationally expensive operations that cause performance bottlenecks. To accelerate the method and overcome these obstacles, we developed two parallel versions of ALS for the CPD. The first uses the full tensor and runs in parallel on heterogeneous shared-memory systems (CPUs and GPUs). The second decomposes the tensor in parallel using small random block samples and runs on homogeneous shared-memory systems (CPUs).
http://creativecommons.org/licenses/by/4.0/
Technical University of Crete::School of Electrical and Computer Engineering
Papagiannakos_Ioannis-Marios_Dip_2019.pdf
Chania [Greece]
Library of TUC
2019-10-04
application/pdf
831.5 kB
free
Papagiannakos Ioannis-Marios
Παπαγιαννακος Ιωαννης-Μαριος
Liavas Athanasios
Λιαβας Αθανασιος
Karystinos Georgios
Καρυστινος Γεωργιος
Samoladas Vasilis
Σαμολαδας Βασιλης
Πολυτεχνείο Κρήτης
Technical University of Crete
Canonical polyadic decomposition
Alternating least squares
Shared memory systems
OpenMP
CUDA
Tensor
Randomized block sampling
PARAFAC
Parallel computing