FACESiR: face and speaker identity recognition in video streams

Karageorgiadis Anastasios

URI:

http://purl.tuc.gr/dl/dias/8CF4DC95-DFB4-4E5F-8429-C5510F81BE56

Year

2019

Type of Item

Diploma Work

Bibliographic Citation

Anastasios Karageorgiadis, "FACESiR: face and speaker identity recognition in video streams", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2019, https://doi.org/10.26233/heallink.tuc.84791

Appears in Collections

Diploma Works in Community School of Electrical and Computer Engineering

Diploma Works in Community Intelligent Systems Laboratory

Summary

Person indexing in video streams requires first recognizing a person's identity and then finding the time slot in which that person appears. In this diploma thesis, we develop a method for identifying the speakers who appear in a video stream using machine learning techniques. More specifically, we exploit the structure of a video as a sequence of images and sounds and use Neural Networks to identify the speaker at each video frame. The problem is divided into two sub-problems, Face Recognition and Speaker Recognition, each of which is further split into smaller ones using a top-down design. Each sub-problem is solved individually, but combining their output probabilities per class leads to an improved final classification decision. The method has been implemented in the Python programming language using the TensorFlow framework and the Keras API. The suggested approach is based on Convolutional Neural Network architectures for both Face and Speaker Recognition. As a result, the combination of image and sound leads to a better decision about the identity of a person who appears in a specific time slot of the video. The main advantage of the proposed method is that it can serve many different use cases, such as searching for missing persons, recognizing celebrities, or even promoting public figures. It is also worth mentioning that, with some minor changes, it can be used to identify any other entity in a video stream.
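The summary states that the per-class output probabilities of the two recognizers are combined, but it does not say which combination rule is used. As a rough illustration only, the sketch below assumes a simple weighted average of the two probability vectors followed by an argmax. The function names, the `face_weight` parameter, the labels, and the example probabilities are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def fuse_probabilities(face_probs: Sequence[float],
                       voice_probs: Sequence[float],
                       face_weight: float = 0.5) -> List[float]:
    """Weighted average of two per-class probability vectors.

    Assumption: this averaging rule is only one plausible way to combine
    the outputs; the thesis summary does not specify the actual rule.
    """
    if len(face_probs) != len(voice_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")
    voice_weight = 1.0 - face_weight
    return [face_weight * f + voice_weight * v
            for f, v in zip(face_probs, voice_probs)]


def predict_identity(fused: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest fused probability."""
    best_index = max(range(len(fused)), key=lambda i: fused[i])
    return labels[best_index]


# Hypothetical per-frame outputs for three enrolled identities.
labels = ["person_a", "person_b", "person_c"]
face_probs = [0.50, 0.45, 0.05]   # face model is nearly undecided
voice_probs = [0.20, 0.70, 0.10]  # speaker model favours person_b

fused = fuse_probabilities(face_probs, voice_probs)
print(predict_identity(fused, labels))
```

In this made-up frame the face model alone would narrowly pick `person_a`, while the fused scores favour `person_b`, which is the kind of correction the summary attributes to combining image and sound. Other rules, such as multiplying the probabilities or learning the weights, would fit the same interface.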
