Using Deep Learning Techniques to Improve Occlusion Handling in Augmented Reality

Bekiri, Roumaissa (2024) Utilisation des techniques de deep learning pour l’amélioration de la gestion des occultations pour la réalité augmentée. Doctoral thesis, Université Mohamed Khider (Biskra - Algérie).

Thesis_Roumaissa.pdf — Download (13MB)

Abstract

Augmented Reality (AR) represents a groundbreaking technological frontier that seamlessly merges the digital and physical worlds. At the core of this technology lies the need for precise and intuitive interaction, and hand pose estimation has emerged as a crucial component in achieving it. Hand pose estimation is also considered more challenging than the estimation of other human body parts because of the hand's small size, greater articulation complexity, and frequent self-occlusions. In this context, we investigate the occlusion problem that arises during interaction. This dissertation first proposes a classical method for resolving occlusion in a dynamic augmented reality system by employing a close-range photogrammetry algorithm. Additionally, we create realistic datasets composed of physical scenes captured from cameras at different viewpoints. We further apply depth map data, which proves to be a valuable strategy for managing occlusion in augmented reality scenarios: a depth map provides essential information about the spatial relationships and distances between objects in the scene, so the system can accurately determine which objects should appear in front of or behind others. This approach has proven instrumental in addressing the persistent challenge of occlusion, allowing digital content to be composited seamlessly and in context, and thus creating more immersive AR experiences. We then extend our study to the online setting, addressing the problem of hand pose estimation with a new regression method that operates on monocular RGB images and tackles occlusion during hand-object interaction in real time. With the advent of deep learning, the field has shifted towards deep neural networks that learn to grasp and manipulate objects accurately. The proposed framework, referred to as the "ResUnet network," provides effective capabilities for detecting and predicting both 2D and 3D hand poses.
This is achieved through three primary modules: feature extraction, which employs transfer learning to extract feature maps; 2D pose regression; and 3D hand pose estimation. Our regression method consistently outperforms current state-of-the-art hand pose estimation approaches, as demonstrated by quantitative and qualitative results on three datasets.
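The depth-map strategy described in the abstract amounts to a per-pixel depth test: a virtual pixel is drawn only where the virtual surface is closer to the camera than the real scene. The following is a minimal NumPy sketch of that idea; the function name and the toy arrays are illustrative, not taken from the thesis.

```python
import numpy as np

def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
    """Per-pixel depth test for AR occlusion handling (illustrative).

    Virtual pixels are composited onto the real image only where the
    virtual object exists (depth > 0) and lies in front of the real
    scene (smaller depth = closer to the camera).
    """
    visible = (virt_depth > 0) & (virt_depth < real_depth)
    out = real_rgb.copy()
    out[visible] = virt_rgb[visible]
    return out

# Toy 2x2 example: a white virtual object at depth 1.0 is hidden
# wherever the real scene (depth 0.5) is closer, and absent where
# its depth map is 0.
real_rgb = np.zeros((2, 2, 3), dtype=np.uint8)       # black background
virt_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)   # white object
real_depth = np.array([[0.5, 2.0], [2.0, 2.0]])
virt_depth = np.array([[1.0, 1.0], [0.0, 1.0]])      # 0.0 = no object

result = composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth)
```

In this example the top-left pixel stays black (the real scene occludes the virtual object), the bottom-left stays black (no virtual object there), and the remaining two pixels show the virtual object.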
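The three-module pipeline named in the abstract (feature extraction via transfer learning, 2D pose regression, 3D hand pose estimation) can be sketched as a single network in PyTorch. Everything below is an assumption for illustration: the tiny convolutional backbone stands in for a pretrained feature extractor, the 21-keypoint count is the common hand-joint convention, and none of the layer sizes come from the thesis's actual ResUnet architecture.

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21  # common hand-joint count; assumed, not stated in the abstract

class HandPoseNet(nn.Module):
    """Illustrative three-stage pipeline: features -> 2D pose -> 3D pose."""

    def __init__(self):
        super().__init__()
        # Stage 1: feature extraction (stand-in for a pretrained
        # backbone used via transfer learning).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Stage 2: regress 2D keypoints (x, y per joint).
        self.head_2d = nn.Linear(64, NUM_KEYPOINTS * 2)
        # Stage 3: lift the 2D keypoints to 3D (x, y, z per joint).
        self.head_3d = nn.Linear(NUM_KEYPOINTS * 2, NUM_KEYPOINTS * 3)

    def forward(self, rgb):
        feats = self.backbone(rgb)                                  # (B, 64)
        kp2d = self.head_2d(feats).view(-1, NUM_KEYPOINTS, 2)       # (B, 21, 2)
        kp3d = self.head_3d(kp2d.flatten(1)).view(-1, NUM_KEYPOINTS, 3)
        return kp2d, kp3d

model = HandPoseNet()
kp2d, kp3d = model(torch.randn(1, 3, 128, 128))  # one monocular RGB crop
```

The point of the sketch is the data flow: a monocular RGB image yields 2D keypoints, which are then lifted to 3D, mirroring the 2D-then-3D regression order described in the abstract.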

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: Augmented Reality, Occlusion, hand pose estimation, deep learning, Human-computer interaction, 2D pose, 3D pose
Subjects: Q Science > Q Science (General)
Divisions: Faculté des Sciences Exactes et des Sciences de la Nature et de la Vie > Département d'informatique
Depositing User: BFSE
Date Deposited: 20 Mar 2024 07:58
Last Modified: 20 Mar 2024 07:58
URI: http://thesis.univ-biskra.dz/id/eprint/6406
