Institutional Repository
Technical University of Crete

Context-aware gaze prediction applied to game level design, level-of-detail and stereo manipulation

Koulieris Georgios-Alexandros

Year: 2015
Type of Item: Doctoral Dissertation
Bibliographic Citation: Georgios-Alexandros Koulieris, "Context-aware gaze prediction applied to game level design, level-of-detail and stereo manipulation", Doctoral Dissertation, School of Electronic and Computer Engineering, Technical University of Crete, Chania, Greece, 2015

The prediction of visual attention can significantly improve many aspects of computer graphics and games. For example, image synthesis can be accelerated by reducing complex computations in non-attended scene regions, and Level-of-Detail rendering can be improved. Current gaze prediction models often fail to accurately predict user fixations, mostly because they include limited or no information about the context of the scene; they commonly rely on low-level image features such as luminance, contrast and motion, or on pre-determined task restrictions on attention, to predict user gaze. These features do not drive user attention reliably during interaction with a synthetic scene, e.g. in a video game. In such cases the user controls the viewport and often consciously ignores low-level salient features in order to navigate the scene or perform a task. This dissertation contributes two novel predictive, scene-context-based models of attention that yield more accurate attention predictions than state-of-the-art low-level image saliency methods. Both models take into account critical high-level scene context features, such as object topology and task-related object function, that influence fixation guidance when gazing at interactive content. Developing the models was a challenging problem, since qualitative features such as object topology, inter-object relationships and tasks had to be quantified and formally modeled in order to generate probabilities of object attendance from subjective features. By acknowledging high-level contextual features we were able to develop gaze predictors that accurately predict gaze in cases where low-level image-based predictors fail. The first model is an automated high-level saliency predictor that incorporates six hypotheses/factors from perception and cognitive science and can be adapted to different tasks.
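The core idea of the first model — combining quantified high-level factors into per-object attendance probabilities — can be sketched as a weighted combination of factor scores normalized with a softmax. The factor names, scores and weights below are illustrative assumptions, not the dissertation's fitted parameters:

```python
from math import exp

def attendance_probabilities(factor_scores, weights):
    """Combine per-object high-level factor scores into a probability
    of each object being attended: weighted sum per object, then a
    softmax over all objects (sketch, not the dissertation's model)."""
    combined = {
        obj: sum(weights[f] * s for f, s in scores.items())
        for obj, scores in factor_scores.items()
    }
    z = sum(exp(v) for v in combined.values())
    return {obj: exp(v) / z for obj, v in combined.items()}

# Hypothetical scene: a teapot consistent with a kitchen context,
# a fire hydrant that is out of context and a singleton.
scores = {
    "teapot":  {"scene_schema": 0.2, "singleton": 0.1},
    "hydrant": {"scene_schema": 0.9, "singleton": 0.8},
}
weights = {"scene_schema": 1.0, "singleton": 0.5}
probs = attendance_probabilities(scores, weights)
# The out-of-context object receives the higher attendance probability.
```

A softmax is only one way to turn the combined scores into probabilities; any monotone normalization would preserve the ranking of objects by predicted attendance.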
The first hypothesis states that a scene comprises objects expected to be found in a specific context as well as objects out of context, which are salient (scene schemata). The second claims that the viewer's attention is captured by isolated objects (singletons). We employ an object-intrinsic factor accounting for the canonical form of objects, an object-context factor for contextual isolation of objects, a feature uniqueness term that accounts for the number of salient features in an image, and a temporal context factor that generates recurring fixations on objects inconsistent with the context. We extend Eckstein's Differential Weighting Model by incorporating these six hypotheses. We then conduct a formal eye-tracking experiment which confirms that object saliency guides attention to specific objects in a game scene and determines appropriate parameters for the model. We present a GPU-based system architecture that estimates, in real time, the probability of each object being attended. We embed this tool in a game level editor to automatically adjust game level difficulty based on object saliency, offering a novel way to facilitate game design. We perform a study confirming that game level completion time depends on object topology, as predicted by our system. We then develop an attention-based Level-of-Detail manager that downgrades the quality of areas expected to go unnoticed by an observer, to economize on computational resources. Our system (C-LOD) maintains a constant frame rate on mobile devices by dynamically re-adjusting the material quality of secondary visual features (e.g. subsurface scattering) of non-attended objects. In a proof-of-concept study we establish that, with C-LOD, complex effects such as parallax occlusion mapping, usually omitted on mobile devices, can be employed without overloading the GPU while conserving battery power.
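The frame-rate-maintaining behaviour of a C-LOD-style manager can be sketched as a simple feedback controller: when the last frame overran its budget, lower the material quality of non-attended objects; when there is headroom, restore it. The quality levels (0–3) and the 10% headroom threshold are assumptions for illustration, not the dissertation's actual control law:

```python
def adjust_quality(frame_ms, target_ms, quality, step=1,
                   q_min=0, q_max=3):
    """One control step of a C-LOD-style manager (sketch): degrade the
    quality of non-attended objects when over the frame-time budget,
    restore it when there is at least 10% headroom."""
    if frame_ms > target_ms:
        quality = max(q_min, quality - step)      # over budget: degrade
    elif frame_ms < 0.9 * target_ms:
        quality = min(q_max, quality + step)      # headroom: restore
    return quality

# 60 fps target -> roughly a 16.7 ms frame-time budget.
q = 3
q = adjust_quality(22.0, 16.7, q)   # slow frame: drop one quality level
q = adjust_quality(12.0, 16.7, q)   # fast frame: restore one level
```

In a real renderer the quality level would select which secondary effects (e.g. subsurface scattering, parallax occlusion mapping) are enabled on non-attended objects, with attended objects kept at full quality.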
We then develop our second model, addressing the challenge of gaze prediction in the demanding context of real-time, heavily task-oriented applications such as games. Our key observation is that player actions are highly correlated with the present state of a game, encoded by game variables. Based on this, we train a classifier to learn these correlations, using an eye tracker to provide the ground-truth object being looked at. The classifier is used at runtime to predict the attended object category -- and thus gaze -- during game play, based on the current state of the game variables. We evaluate the quality of our gaze predictor numerically and experimentally, showing that it predicts gaze more accurately than previous image-based approaches. Given that comfortable, high-quality 3D stereo viewing is becoming a requirement for interactive applications, we use this prediction to propose a dynamic local disparity manipulation method which provides rich and comfortable depth, in sharp contrast to previous global disparity methods that suffer from extreme depth compression (cardboarding). A subjective rating study demonstrates that our localized disparity manipulation is preferred over previous methods.
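The second model's training setup can be sketched as follows: game-state vectors are paired with the eye-tracked object category, and a classifier learns the mapping. A nearest-centroid classifier stands in here for whatever learner the dissertation actually used, and the state variables (health, enemy distance, ammo) are hypothetical:

```python
from collections import defaultdict

class GazeFromGameState:
    """Sketch of the second model's idea: learn correlations between
    game-state variables and the eye-tracked object category, then
    predict the attended category from the game state alone."""

    def fit(self, states, categories):
        sums = defaultdict(lambda: None)
        counts = defaultdict(int)
        for s, c in zip(states, categories):
            sums[c] = (list(s) if sums[c] is None
                       else [a + b for a, b in zip(sums[c], s)])
            counts[c] += 1
        # One centroid (mean state vector) per gazed-object category.
        self.centroids = {c: [v / counts[c] for v in sums[c]]
                          for c in sums}
        return self

    def predict(self, state):
        # Return the category whose centroid is nearest to this state.
        dist = lambda c: sum((a - b) ** 2
                             for a, b in zip(state, self.centroids[c]))
        return min(self.centroids, key=dist)

# Hypothetical state: (player_health, enemy_distance, ammo), all in [0, 1],
# labelled with the object category the eye tracker recorded.
states = [(0.9, 0.8, 1.0), (0.2, 0.1, 0.3), (0.8, 0.9, 0.9)]
labels = ["waypoint", "enemy", "waypoint"]
clf = GazeFromGameState().fit(states, labels)
clf.predict((0.1, 0.2, 0.2))   # low health, close enemy
```

At runtime only `predict` is needed, so the per-frame cost is a handful of vector distances — cheap enough to drive the localized disparity manipulation each frame.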