The work titled “Eye tracking interaction on unmodified mobile VR headsets using the selfie camera” by Drakopoulos Panagiotis, Koulieris Georgios-Alexandros, and Mania Aikaterini is licensed under Creative Commons Attribution 4.0 International
Bibliographic Citation
P. Drakopoulos, G.-A. Koulieris, and K. Mania, “Eye tracking interaction on unmodified mobile VR headsets using the selfie camera,” ACM Trans. Appl. Percept., vol. 18, no. 3, July 2021, doi: 10.1145/3456875.
https://doi.org/10.1145/3456875
Input methods for interaction in smartphone-based virtual and mixed reality (VR/MR) currently rely on uncomfortable head tracking that controls a pointer on the screen. User fixations are a fast and natural input method for VR/MR interaction. Previously, eye tracking in mobile VR suffered from low accuracy, long processing time, and the need for hardware add-ons such as anti-reflective lens coating and infrared emitters. We present an innovative mobile VR eye tracking methodology that uses only the eye images captured by the front-facing (selfie) camera through the headset’s lens, without any modifications. Our system first enhances the low-contrast, poorly lit eye images by applying a pipeline of customised low-level image enhancements that suppress obtrusive lens reflections. We then propose an iris region-of-interest detection algorithm that is run only once, which increases iris tracking speed on mobile devices by reducing the iris search space. We iteratively fit a customised geometric model to the iris to refine its coordinates. We display a thin bezel of light at the top edge of the screen for constant illumination. A confidence metric calculates the probability of successful iris detection. Calibration and linear gaze mapping between the estimated iris centroid and physical pixels on the screen result in low-latency, real-time iris tracking. A formal study confirmed that our system’s accuracy is similar to that of eye trackers in commercial VR headsets in the central part of the headset’s field of view. In a VR game, users completed tasks as fast with gaze-driven interaction as with head-tracked interaction, without the need for consecutive head motions. In a VR panorama viewer, users could successfully switch between panoramas using gaze.
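The calibration and linear gaze mapping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the mapping is an independent least-squares line per axis (screen = gain * iris + offset), and the names `fit_axis`, `calibrate`, and `map_gaze` are hypothetical. The abstract does not specify the exact form of the mapping or the calibration procedure.

```python
from typing import List, Tuple

Point = Tuple[float, float]
AxisModel = Tuple[float, float]          # (gain, offset) for one axis
GazeModel = Tuple[AxisModel, AxisModel]  # (x-axis model, y-axis model)


def fit_axis(iris: List[float], screen: List[float]) -> AxisModel:
    """Least-squares fit of screen = gain * iris + offset for one axis."""
    n = len(iris)
    if n < 2 or n != len(screen):
        raise ValueError("need at least two matched calibration samples")
    mean_i = sum(iris) / n
    mean_s = sum(screen) / n
    var = sum((i - mean_i) ** 2 for i in iris)
    if var == 0:
        raise ValueError("iris samples must not all be identical")
    cov = sum((i - mean_i) * (s - mean_s) for i, s in zip(iris, screen))
    gain = cov / var
    return gain, mean_s - gain * mean_i


def calibrate(iris_pts: List[Point], screen_pts: List[Point]) -> GazeModel:
    """Fit one line per axis from matched iris-centroid / screen-target pairs.

    Assumption: x and y are mapped independently; the paper may differ.
    """
    model_x = fit_axis([p[0] for p in iris_pts], [p[0] for p in screen_pts])
    model_y = fit_axis([p[1] for p in iris_pts], [p[1] for p in screen_pts])
    return model_x, model_y


def map_gaze(model: GazeModel, iris_pt: Point) -> Point:
    """Map an iris centroid (camera pixels) to a screen position (pixels)."""
    (gain_x, off_x), (gain_y, off_y) = model
    return gain_x * iris_pt[0] + off_x, gain_y * iris_pt[1] + off_y
```

In use, the viewer would fixate a handful of known on-screen targets while iris centroids are recorded; `calibrate` then fits the model once, and `map_gaze` converts each subsequent centroid to a screen position with a few multiplications, which is consistent with the low latency the abstract reports for a linear mapping.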