Advanced Scene Perception for Augmented Reality

A special issue of Journal of Imaging (ISSN 2313-433X). This special issue belongs to the section "Mixed, Augmented and Virtual Reality".

Deadline for manuscript submissions: closed (31 December 2021) | Viewed by 23233

Special Issue Editors


Guest Editor
German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
Interests: 3D computer vision, augmented reality, SLAM, sensor fusion, activity/workflow modelling and recognition, semantic segmentation, hand-object interaction, real-time edge AI for AR

Guest Editor
German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
Interests: 3D computer vision; augmented reality; object pose estimation and tracking; machine learning; sensor fusion; domain adaptation; SLAM; 3D sensing; semantic reconstruction; scan-to-CAD

Special Issue Information

Dear Colleagues,

Augmented Reality (AR), combining virtual elements with the real world, has shown impressive results in a variety of application fields and has gained significant attention in recent years due to its limitless potential. AR applications rely heavily on the quality and extent of understanding of the user’s surroundings as well as the dynamic monitoring of the user’s interactions with their environment. While traditional AR relied on the precise localization of the user, nowadays a deeper scene perception at multiple levels is expected, ranging from dense environment reconstruction and semantic understanding to hand–object interaction and action recognition. An advanced, efficient understanding of surroundings enables AR applications that support full interaction between real and virtual elements and are able to monitor and support users reliably in complex real-world tasks such as industrial maintenance or medical procedures.

In this Special Issue, we aim to feature novel research that advances the state of the art in scene perception for AR, contributing to topics including semantic SLAM, object pose estimation and tracking, dynamic scene analysis, 3D environmental sensing and sensor fusion, hand tracking and hand–object interaction, and illumination reconstruction. Comprehensive state-of-the-art reviews on relevant topics and innovative AR applications taking advantage of recent scene perception developments are also highly welcome.

Prof. Dr. Didier Stricker
Dr. Jason Rambach
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Augmented Reality
  • 3D Computer Vision
  • Semantic SLAM
  • Object Detection/Pose and Shape
  • Machine Learning/Deep Learning
  • Hand-Object Interaction
  • 3D Sensing
  • AR and Edge AI

Published Papers (8 papers)

Editorial

2 pages, 160 KiB  
Editorial
Advanced Scene Perception for Augmented Reality
by Jason Rambach and Didier Stricker
J. Imaging 2022, 8(10), 287; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8100287 - 17 Oct 2022
Cited by 1 | Viewed by 1174
Abstract
Augmented reality (AR), combining virtual elements with the real world, has demonstrated impressive results in a variety of application fields and gained significant research attention in recent years due to its limitless potential [...]
(This article belongs to the Special Issue Advanced Scene Perception for Augmented Reality)

Research

12 pages, 14061 KiB  
Article
Efficient and Scalable Object Localization in 3D on Mobile Device
by Neetika Gupta and Naimul Mefraz Khan
J. Imaging 2022, 8(7), 188; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8070188 - 08 Jul 2022
Cited by 4 | Viewed by 1821
Abstract
Two-dimensional (2D) object detection has been an intensely discussed and researched field of computer vision. Despite numerous advancements made in the field over the years, we still need to identify a robust approach to efficiently conduct classification and localization of objects in our environment using only our mobile devices. Moreover, 2D object detection limits the overall understanding of the detected object and does not provide any additional information in terms of its size and position in the real world. This work proposes a novel object localization solution in three dimensions (3D) for mobile devices. The proposed method works by combining a 2D object detection Convolutional Neural Network (CNN) model with Augmented Reality (AR) technologies to recognize objects in the environment and determine their real-world coordinates. We leverage the built-in Simultaneous Localization and Mapping (SLAM) capability of Google’s ARCore to detect planes and obtain the camera information for generating cuboid proposals from an object’s 2D bounding box. The proposed method is fast and efficient for identifying everyday objects in real-world space and, unlike mobile offloading techniques, the method is designed to work with the limited resources of a mobile device.
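
As a rough illustration of how a 2D detection might be lifted to 3D using a detected plane and camera information, the sketch below back-projects the bottom-center pixel of a bounding box through a pinhole camera and intersects the resulting ray with a plane. It is not the authors' implementation and does not use the ARCore API; the intrinsics (fx, fy, cx, cy), the plane parametrization and all function names are assumptions made purely for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_ray(u: float, v: float, fx: float, fy: float,
                 cx: float, cy: float) -> Vec3:
    """Back-project pixel (u, v) to a ray direction in the camera frame.
    Hypothetical pinhole model with z pointing forward."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def intersect_ray_plane(origin: Vec3, direction: Vec3,
                        normal: Vec3, d: float) -> Optional[Vec3]:
    """Intersect the ray origin + t * direction (t >= 0) with the plane
    normal . p + d = 0. Returns None if the ray is parallel to the plane
    or the intersection lies behind the origin."""
    denom = sum(n * r for n, r in zip(normal, direction))
    if abs(denom) < 1e-9:
        return None
    t = -(sum(n * o for n, o in zip(normal, origin)) + d) / denom
    if t < 0:
        return None
    return tuple(o + t * r for o, r in zip(origin, direction))


def anchor_from_bbox(bbox, intrinsics, plane) -> Optional[Vec3]:
    """Estimate where an object touches a supporting plane from its 2D box
    (x_min, y_min, x_max, y_max). The result is in camera coordinates."""
    x_min, y_min, x_max, y_max = bbox
    u, v = (x_min + x_max) / 2.0, y_max  # bottom-center pixel of the box
    ray = pixel_to_ray(u, v, *intrinsics)
    normal, d = plane
    return intersect_ray_plane((0.0, 0.0, 0.0), ray, normal, d)
```

A full cuboid proposal would need additional steps (for example, estimating the object's extent along the plane and its height), which this sketch deliberately leaves out.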

13 pages, 26935 KiB  
Article
Comparing Desktop vs. Mobile Interaction for the Creation of Pervasive Augmented Reality Experiences
by Tiago Madeira, Bernardo Marques, Pedro Neves, Paulo Dias and Beatriz Sousa Santos
J. Imaging 2022, 8(3), 79; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8030079 - 18 Mar 2022
Cited by 8 | Viewed by 2627
Abstract
This paper presents an evaluation and comparison of interaction methods for the configuration and visualization of pervasive Augmented Reality (AR) experiences using two different platforms: desktop and mobile. AR experiences consist of the enhancement of real-world environments by superimposing additional layers of information, real-time interaction, and accurate 3D registration of virtual and real objects. Pervasive AR extends this concept through experiences that are continuous in space, being aware of and responsive to the user’s context and pose. Currently, the time and technical expertise required to create such applications are the main reasons preventing their widespread use. As such, authoring tools which facilitate the development and configuration of pervasive AR experiences have become progressively more relevant. Their operation often involves the navigation of the real-world scene and the use of the AR equipment itself to add the augmented information within the environment. The proposed experimental tool makes use of 3D scans from physical environments to provide a reconstructed digital replica of such spaces for a desktop-based method, and to enable positional tracking for a mobile-based one. While the desktop platform represents a non-immersive setting, the mobile one provides continuous AR in the physical environment. Both versions can be used to place virtual content and ultimately configure an AR experience. The authoring capabilities of the different platforms were compared by conducting a user study focused on evaluating their usability. Although the AR interface was generally considered more intuitive, the desktop platform shows promise in several aspects, such as remote configuration, lower required effort, and overall better scalability.

17 pages, 953 KiB  
Article
Direct and Indirect vSLAM Fusion for Augmented Reality
by Mohamed Outahar, Guillaume Moreau and Jean-Marie Normand
J. Imaging 2021, 7(8), 141; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging7080141 - 10 Aug 2021
Cited by 4 | Viewed by 2210
Abstract
Augmented reality (AR) is an emerging technology that is applied in many fields. One of the limitations that still prevents AR from being even more widely used relates to the accessibility of devices. Indeed, the devices currently used are usually high end, expensive glasses or mobile devices. vSLAM (visual simultaneous localization and mapping) algorithms circumvent this problem by requiring only relatively cheap cameras for AR. vSLAM algorithms can be classified as direct or indirect methods based on the type of data used. Each class of algorithms works optimally on one type of scene (e.g., textured or untextured) but unfortunately with little overlap. In this work, a method is proposed to fuse a direct and an indirect method in order to achieve higher robustness and to offer the possibility for AR to move seamlessly between different types of scenes. Our method is tested on three datasets against state-of-the-art direct (LSD-SLAM), semi-direct (LCSD) and indirect (ORBSLAM2) algorithms in two different scenarios: a trajectory planning and an AR scenario where a virtual object is displayed on top of the video feed; furthermore, a similar method (LCSD SLAM) is also compared to our proposal. Results show that our fusion algorithm is generally as efficient as the best algorithm both in terms of trajectory (mean errors with respect to ground truth trajectory measurements) and in terms of quality of the augmentation (robustness and stability). In short, we propose a fusion algorithm that, in our tests, takes the best of both the direct and indirect methods.
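
To make the notion of a mean error with respect to a ground truth trajectory concrete, the following minimal sketch averages the Euclidean distance between corresponding positions. It is not the evaluation protocol used in the paper: the function name is hypothetical, and the sketch assumes the two trajectories are already time-associated and expressed in the same coordinate frame, so no alignment step is performed.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def mean_trajectory_error(estimated: Sequence[Point],
                          ground_truth: Sequence[Point]) -> float:
    """Mean Euclidean distance between corresponding 3D positions.
    Assumes equal-length, time-associated trajectories in a shared frame."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    total = sum(math.dist(e, g) for e, g in zip(estimated, ground_truth))
    return total / len(estimated)
```

In practice, SLAM evaluations usually align the estimated trajectory to the ground truth (for example with a rigid or similarity transform) before computing such an error; that step is omitted here.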

18 pages, 3952 KiB  
Article
From IR Images to Point Clouds to Pose: Point Cloud-Based AR Glasses Pose Estimation
by Ahmet Firintepe, Carolin Vey, Stylianos Asteriadis, Alain Pagani and Didier Stricker
J. Imaging 2021, 7(5), 80; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging7050080 - 27 Apr 2021
Cited by 4 | Viewed by 2685
Abstract
In this paper, we propose two novel AR glasses pose estimation algorithms from single infrared images by using 3D point clouds as an intermediate representation. Our first approach “PointsToRotation” is based on a Deep Neural Network alone, whereas our second approach “PointsToPose” is a hybrid model combining Deep Learning and a voting-based mechanism. Our methods utilize a point cloud estimator, which we trained on multi-view infrared images in a semi-supervised manner, generating point clouds based on one image only. We generate a point cloud dataset with our point cloud estimator using the HMDPose dataset, consisting of multi-view infrared images of various AR glasses with the corresponding 6-DoF poses. In comparison to another point cloud-based 6-DoF pose estimation method named CloudPose, we achieve an error reduction of around 50%. Compared to a state-of-the-art image-based method, we reduce the pose estimation error by around 96%.

Review

20 pages, 721 KiB  
Review
What Is Significant in Modern Augmented Reality: A Systematic Analysis of Existing Reviews
by Athanasios Nikolaidis
J. Imaging 2022, 8(5), 145; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8050145 - 21 May 2022
Cited by 5 | Viewed by 2291
Abstract
Augmented reality (AR) is a field of technology that has evolved drastically during the last decades, due to its vast range of applications in everyday life. This paper gives researchers an overview of what has been surveyed since 2010 in terms of both AR application areas and technical aspects, and discusses the extent to which each has been covered. It also examines whether one can extract useful evidence of which aspects have not been covered adequately, and whether common taxonomy criteria can be defined for performing AR reviews in the future. To this end, a search with inclusion and exclusion criteria has been performed in the Scopus database, producing a representative set of 47 reviews, covering the years from 2010 onwards. A proper taxonomy of the results is introduced, and the findings reveal, among other things, the lack of AR application reviews covering all suggested criteria.

18 pages, 1324 KiB  
Review
A Survey of 6D Object Detection Based on 3D Models for Industrial Applications
by Felix Gorschlüter, Pavel Rojtberg and Thomas Pöllabauer
J. Imaging 2022, 8(3), 53; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8030053 - 24 Feb 2022
Cited by 7 | Viewed by 4185
Abstract
Six-dimensional object detection of rigid objects is a problem especially relevant for quality control and robotic manipulation in industrial contexts. This work is a survey of the state of the art of 6D object detection with these use cases in mind, specifically focusing on algorithms trained only with 3D models or renderings thereof. Our first contribution is a listing of requirements typically encountered in industrial applications. The second contribution is a collection of quantitative evaluation results for several different 6D object detection methods trained with synthetic data and the comparison and analysis thereof. We identify the top methods for individual requirements that industrial applications have for object detectors, but find that a lack of comparable data prevents large-scale comparison over multiple aspects.

Other

20 pages, 918 KiB  
Systematic Review
Augmented Reality Games and Presence: A Systematic Review
by Anabela Marto and Alexandrino Gonçalves
J. Imaging 2022, 8(4), 91; https://0-doi-org.brum.beds.ac.uk/10.3390/jimaging8040091 - 29 Mar 2022
Cited by 14 | Viewed by 4759
Abstract
The sense of presence in augmented reality (AR) has been studied by multiple researchers through diverse applications and strategies. In addition to the valuable information provided to the scientific community, new questions keep being raised. These approaches range from applying virtual reality standards to ascertain presence in users’ experiences to new proposals for evaluating presence that specifically target AR environments. It is undeniable that the idea of evaluating presence across AR may be overwhelming due to the different scenarios that may be possible, whether this regards technological devices (from immersive AR headsets to the small screens of smartphones) or the amount of virtual information that is being added to the real scenario. Taking into account the recent literature that has addressed the sense of presence in AR as a true challenge given the diversity of ways that AR can be experienced, this study proposes a specific scope to address presence and related dimensions such as immersion, engagement, embodiment, or telepresence, when AR is used in games. This systematic review was conducted following the PRISMA methodology, carefully analysing all studies that reported visual games that include AR activities and included, in some form, presence data or related dimensions that may be referred to as immersion-related feelings, analysis or results. This study clarifies what dimensions of presence are being considered and evaluated in AR games, how presence-related variables have been evaluated, and what the major research findings are. For a better understanding of these approaches, this study takes note of what devices are being used for the AR experience when immersion-related feelings are one of the behaviours that are considered in their evaluations, and discusses to what extent these feelings in AR games affect the player’s other behaviours.