Visual localization in unstructured environments through deep learning

Petrakis Georgios

URI	http://purl.tuc.gr/dl/dias/518179D9-9064-427B-A97C-7086D7CC626C	-
Identifier	https://doi.org/10.26233/heallink.tuc.97894	-
Language	en	-
Extent	145 pages	en
Extent	9 megabytes	en
Title	Visual localization in unstructured environments through deep learning	en
Title	Οπτικός εντοπισμός σε μη δομημένα περιβάλλοντα μέσω βαθιάς μάθησης	el
Creator	Petrakis Georgios	en
Creator	Πετρακης Γεωργιος	el
Contributor [Thesis Supervisor]	Partsinevelos Panagiotis	en
Contributor [Thesis Supervisor]	Παρτσινεβελος Παναγιωτης	el
Contributor [Committee Member]	Lagoudakis Michail	en
Contributor [Committee Member]	Λαγουδακης Μιχαηλ	el
Contributor [Committee Member]	Chalkiadakis Georgios	en
Contributor [Committee Member]	Χαλκιαδακης Γεωργιος	el
Contributor [Committee Member]	Mertikas Stylianos	en
Contributor [Committee Member]	Μερτικας Στυλιανος	el
Contributor [Committee Member]	Geroliminis Nikolas	en
Contributor [Committee Member]	Ioannidis Charalabos	en
Contributor [Committee Member]	Doulamis Anastasios	en
Publisher	Πολυτεχνείο Κρήτης	el
Publisher	Technical University of Crete	en
Academic Unit	Technical University of Crete::School of Mineral Resources Engineering	en
Academic Unit	Πολυτεχνείο Κρήτης::Σχολή Μηχανικών Ορυκτών Πόρων	el
Description	Doctoral dissertation submitted to the School of Mineral Resources Engineering of the Technical University of Crete in fulfillment of the requirements for the doctoral degree	en
Content Summary	Scene understanding, localization and mapping play a crucial role in computer vision, robotics and geomatics, providing valuable knowledge through a vast and increasing number of methodologies and applications. However, although the literature abounds with related studies in urban and indoor environments, far fewer studies concentrate on unstructured environments. The main goal of this dissertation is to design and develop a visual localization framework based on deep learning that enhances scene understanding and the potential for autonomous navigation in challenging unstructured scenes, and to develop a precise positioning methodology for characteristic point localization in GNSS-denied environments. The dissertation is divided into five parts: (a) design of the training and evaluation datasets; (b) implementation and improvement of a keypoint detection and description neural network for unstructured environments; (c) implementation and development of a lightweight neural network for visual localization focused on unstructured environments, and integration of the trained model into a SLAM (Simultaneous Localization and Mapping) system as a feature extraction module; (d) development of a lightweight encoder-decoder architecture for lunar ground segmentation; (e) development of a precise positioning and mapping alternative for GNSS-denied environments. Regarding the first part of the dissertation, two datasets were designed and created for the training and evaluation of keypoint detectors and descriptors. The training dataset includes 48,000 FPV (First-Person-View) images with a wide range of landscape variations, including images from Earth, the Moon and Mars, while the evaluation dataset includes about 120 sequences of planetary(-like) scenes, where each sequence contains the original image and five generated representations of the same scene that differ in illumination and viewpoint. 
In the second part of this dissertation, a self-supervised neural network architecture called SuperPoint was implemented and modified in order to investigate its efficiency in keypoint detection and description in unstructured and planetary scenes. Three SuperPoint models were produced: (a) an original SuperPoint model trained from scratch, (b) an original fine-tuned SuperPoint model, and (c) an optimized SuperPoint model trained from scratch. The experiments showed that the optimized SuperPoint model outperforms both the original SuperPoint models and handcrafted keypoint detectors and descriptors. Concerning the third part of the dissertation, a multi-task deep learning architecture was developed for keypoint detection and description, focused on feature-poor unstructured and planetary scenes with low or changing illumination; training and evaluation were conducted using the proposed datasets. Moreover, the trained model was integrated into a visual SLAM (Simultaneous Localization and Mapping) system as a feature extraction module and tested in two feature-poor unstructured areas. The proposed architecture provides increased accuracy in keypoint description, outperforming well-known handcrafted algorithms, while the proposed SLAM system achieved better results than the ORB-SLAM2 algorithm in areas with medium and low illumination. In the fourth part of the dissertation, a lightweight encoder-decoder neural network (NN) architecture is proposed for rover-based ground segmentation on the lunar surface. The proposed architecture is composed of a modified MobileNetV2 encoder and a lightweight U-Net decoder, and training and evaluation were conducted using a publicly available synthetic dataset of lunar landscape images. 
The proposed model provides robust segmentation results, achieving accuracy similar to that of the original U-Net and U-Net-based architectures, which are 110 to 140 times larger than the proposed architecture. This study aims to contribute to lunar ground segmentation using deep learning techniques, and it shows significant potential for autonomous lunar navigation, supporting safer and smoother navigation on the Moon. Regarding the fifth part of the dissertation, a precise positioning alternative was developed to localize fiducial markers and characteristic points of the scene, providing their local coordinates in 3D space with a high level of accuracy. First, the fiducial markers are placed in the scene; one of them is used as the origin marker, while the target markers represent the characteristic points or features. Subsequently, the proposed SLAM algorithm enables an RGB-Depth camera to map the desired area and localize itself in an unknown and challenging environment. In combination with geometrical transformations, localization and optimization techniques, the methodology estimates the coordinates of the target markers and an arbitrary point cloud that approximates the structure of the environment. The use of deep learning in unstructured and planetary environments for scene recognition, localization and mapping offers significant potential for future applications, reinforcing crucial topics such as autonomous navigation in hazardous and unknown environments. This dissertation aspires to encourage the investigation and development of AI models and datasets focused on planetary exploration missions, and especially on high- and low-level scene understanding using computationally efficient equipment and methods, reducing the economic and energy costs of robotic systems.	en
Type of Item	Διδακτορική Διατριβή	el
Type of Item	Doctoral Dissertation	en
License	http://creativecommons.org/licenses/by/4.0/	en
Date of Item	2023-11-01	-
Date of Publication	2023	-
Subject	Localization	en
Subject	Deep learning	en
Subject	Computer vision	en
Bibliographic Citation	Georgios Petrakis, "Visual localization in unstructured environments through deep learning", Doctoral Dissertation, School of Mineral Resources Engineering, Technical University of Crete, Chania, Greece, 2023	en
Bibliographic Citation	Γεώργιος Πετράκης, "Οπτικός εντοπισμός σε μη δομημένα περιβάλλοντα μέσω βαθιάς μάθησης", Διδακτορική Διατριβή, Σχολή Μηχανικών Ορυκτών Πόρων, Πολυτεχνείο Κρήτης, Χανιά, Ελλάς, 2023	el
