URI | http://purl.tuc.gr/dl/dias/17E8583A-E9FD-47A3-9217-D6D3CE6A642D | - |
Identifier | https://doi.org/10.26233/heallink.tuc.66075 | - |
Language | en | - |
Size | 1.2 megabytes | en |
Title | Semantic similarity computation and word sense induction using hidden sets multidimensional scaling | en |
Creator | Athanasopoulou Georgia | en |
Creator | Αθανασοπουλου Γεωργια | el |
Contributor [Co-Supervisor] | Potamianos Alexandros | en |
Contributor [Co-Supervisor] | Ποταμιανος Αλεξανδρος | el |
Contributor [Thesis Supervisor] | Koutsakis Polychronis | en |
Contributor [Thesis Supervisor] | Κουτσακης Πολυχρονης | el |
Contributor [Committee Member] | Liavas Athanasios | en |
Contributor [Committee Member] | Λιαβας Αθανασιος | el |
Publisher | Πολυτεχνείο Κρήτης | el |
Publisher | Technical University of Crete | en |
Academic Unit | Technical University of Crete::School of Electrical and Computer Engineering | en |
Academic Unit | Πολυτεχνείο Κρήτης::Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών | el |
Abstract | In this thesis, motivated by evidence in psycholinguistics and cognition, we propose an unsupervised, language-agnostic Distributional Semantic Model (DSM) that utilizes web-harvested data for the problem of semantic similarity estimation. Semantic similarity can be applied to numerous Natural Language Processing (NLP) tasks, such as affective text analysis and paraphrasing.
In the first part of the thesis, we present the construction of typical DSMs following the well-established Vector Space Model. More specifically, we describe the creation of corpora by harvesting web documents with a query-based approach, as well as the state-of-the-art DSMs used to compute semantic similarity from these corpora. Next, we propose a novel hierarchical DSM that is inspired by evidence in psycholinguistics and cognition and consists of low-dimensional manifolds built on semantic neighborhoods. Each manifold is sparsely encoded and mapped into a low-dimensional space. Global operations are decomposed into local operations in multiple sub-spaces; the results of these local operations are fused to produce semantic relatedness estimates. Manifold DSMs are constructed starting from a pairwise word-level semantic similarity matrix. The proposed model is evaluated against state-of-the-art and baseline DSMs on the semantic similarity estimation task, where the similarity metrics are compared with human similarity ratings. The proposed model significantly improves performance compared to the baseline approaches on the task of semantic similarity estimation between words. Furthermore, the proposed model is evaluated on a taxonomy task, achieving state-of-the-art results. Finally, motivated by evidence that the cognitive organization of concepts depends on their degree of concreteness, we present the performance of the proposed DSM for abstract and concrete nouns. | en |
Type | Μεταπτυχιακή Διατριβή | el |
Type | Master Thesis | en |
License | http://creativecommons.org/licenses/by/4.0/ | en |
Date | 2016-07-20 | - |
Date of Publication | 2016 | - |
Subject | Natural language processing | en |
Subject | NLP | en |
Subject | Semantic similarity | en |
Bibliographic Citation | Georgia Athanasopoulou, "Semantic similarity computation and word sense induction using hidden sets multidimensional scaling", Master Thesis, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2016 | en |