Spatial Audio

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (15 March 2017) | Viewed by 77371

Printed Edition Available!
A printed edition of this Special Issue is available.

Special Issue Editors

Dr. Woon-Seng Gan
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Interests: active noise control; adaptive signal processing; psycho-acoustical signal processing; spatial/3D audio processing

Dr. Jung-Woo Choi
School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea

Special Issue Information

Dear Colleagues,

Three-dimensional (or spatial) audio is a growing research field that plays a key role in realizing immersive communication in many of today’s applications for teleconferencing, entertainment, gaming, navigation guidance, and virtual reality (VR)/augmented reality (AR). Technologies for spatial sound capture and binaural recording are becoming add-on modules for our mobile devices, capturing the surrounding soundscape, picking up directional and ambient cues, and creating immersive 3D audio media for playback. We are seeing a surge of research activities and applications that rely on digital spatial audio processing and rendering over loudspeakers (stereo, wave field synthesis, Ambisonics) and headphones, as well as new emerging fields of mobile spatial audio, personal assisted listening, and spatial audio for VR/AR. New developments in graphics processing units (GPUs) and multi-core processors are accelerating the pace of real-time spatial audio processing and enabling new techniques that can lead to high-quality, immersive spatial audio reproduction.

Dr. Woon-Seng Gan
Dr. Jung-Woo Choi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Ambisonics and wave field synthesis
  • Assisted listening
  • Binaural recording/spatial sound capture and noise control
  • Binaural processing with head tracking
  • Head-related transfer functions and their acquisition
  • Immersive audio for VR/AR
  • Loudspeakers and headphones for sound reproduction
  • Mobile spatial audio
  • Novel applications in spatial audio
  • Signal processing for headphones and loudspeakers
  • Sound localization in spatial rendering
  • Spatial audio coding and decoding
  • Spatial rendering and sound field reproduction
  • Real-time implementation and high-performance computing

Published Papers (12 papers)

Editorial

Editorial
Guest Editors’ Note—Special Issue on Spatial Audio
by Woon-Seng Gan and Jung-Woo Choi
Appl. Sci. 2017, 7(8), 788; https://doi.org/10.3390/app7080788 - 03 Aug 2017
Viewed by 3674
Abstract
Three-dimensional (or spatial) audio is a growing research field that plays a key role in realizing immersive communication in many of today’s applications for teleconferencing, entertainment, gaming, navigation guidance, and virtual reality (VR)/augmented reality (AR). [...]

Research

Article
Auditory Distance Control Using a Variable-Directivity Loudspeaker
by Florian Wendt, Franz Zotter, Matthias Frank and Robert Höldrich
Appl. Sci. 2017, 7(7), 666; https://doi.org/10.3390/app7070666 - 29 Jun 2017
Cited by 12 | Viewed by 4005 | Correction
Abstract
The directivity of a sound source in a room influences the direct-to-reverberant (D/R) energy ratio and thus the auditory distance. This study proposes various third-order beampattern designs for precise control of the D/R ratio. A comprehensive experimental study is conducted to investigate the resulting effect on the auditory distance. Our first experiment auralizes the directivity variations using a virtual directional sound source in a virtual room, played back over a 24-channel loudspeaker ring. The experiment moreover shows the influence of room, source-listener distance, signal, and additional single-channel reverberation on the auditory distance. We verify the practical applicability of all the proposed beampattern designs in a second experiment using a variable-directivity sound source in a real room. Predictions of experimental results are made with high accuracy, using room acoustical measures that typically predict the apparent source width.
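
The direct-to-reverberant (D/R) energy ratio that this study controls can be estimated from a measured room impulse response roughly as in the sketch below. This is an illustrative calculation only, not the authors' procedure; the sampling rate, the 2.5 ms direct-sound window, and the toy impulse response are assumptions.

    import numpy as np

    def dr_ratio_db(rir, fs, direct_window_ms=2.5):
        """Estimate the direct-to-reverberant energy ratio (dB) of a room impulse
        response by splitting it shortly after the direct-sound peak (window assumed)."""
        rir = np.asarray(rir, dtype=float)
        split = int(np.argmax(np.abs(rir))) + int(direct_window_ms * 1e-3 * fs)
        direct_energy = np.sum(rir[:split] ** 2)
        reverb_energy = np.sum(rir[split:] ** 2)
        return 10.0 * np.log10(direct_energy / reverb_energy)

    # Toy impulse response: a unit direct sound followed by an exponentially decaying noise tail.
    fs = 48000
    rir = np.zeros(fs // 2)
    rir[100] = 1.0
    n_tail = len(rir) - 200
    rir[200:] = 0.05 * np.random.randn(n_tail) * np.exp(-np.arange(n_tail) / (0.3 * fs))
    print(f"D/R ratio: {dr_ratio_db(rir, fs):.1f} dB")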

Article
Solution Strategies for Linear Inverse Problems in Spatial Audio Signal Processing
by Mingsian R. Bai, Chun Chung, Po-Chen Wu, Yi-Hao Chiang and Chun-May Yang
Appl. Sci. 2017, 7(6), 582; https://doi.org/10.3390/app7060582 - 05 Jun 2017
Cited by 11 | Viewed by 4231
Abstract
The aim of this study was to compare algorithms for solving inverse problems generally encountered in spatial audio signal processing. Tikhonov regularization is typically utilized to solve overdetermined linear systems, with the regularization parameter selected by the golden section search (GSS) algorithm. For underdetermined problems with sparse solutions, several iterative compressive sampling (CS) methods are suggested as alternatives to traditional convex optimization (CVX) methods, which are computationally expensive. The focal underdetermined system solver (FOCUSS), the steepest descent (SD) method, Newton’s (NT) method, and the conjugate gradient (CG) method were developed in this study to solve CS problems more efficiently. These algorithms were compared on problems including source localization and separation, noise source identification, and analysis and synthesis of sound fields, using a uniform linear array (ULA), a uniform circular array (UCA), and a random array. The derived results are discussed herein and guidelines for the application of these algorithms are summarized.
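
As a rough illustration of the Tikhonov/GSS combination described above, the following sketch solves a random overdetermined system with Tikhonov regularization and selects the regularization parameter by a golden section search. The generalized cross-validation score driving the search is a placeholder assumption and not necessarily the selection criterion used in the paper.

    import numpy as np

    def tikhonov_solve(A, b, lam):
        """x = argmin ||A x - b||^2 + lam ||x||^2 (overdetermined least squares)."""
        n = A.shape[1]
        return np.linalg.solve(A.conj().T @ A + lam * np.eye(n), A.conj().T @ b)

    def gcv(A, b, lam):
        """Generalized cross-validation score (constant factors dropped; placeholder criterion)."""
        n = A.shape[1]
        H = A @ np.linalg.solve(A.conj().T @ A + lam * np.eye(n), A.conj().T)
        resid = b - H @ b
        return np.linalg.norm(resid) ** 2 / (len(b) - np.trace(H)) ** 2

    def golden_section_search(f, lo, hi, iters=40):
        """Minimize a unimodal scalar function on [lo, hi] (here searched over log10(lam))."""
        phi = (np.sqrt(5.0) - 1.0) / 2.0
        for _ in range(iters):
            c, d = hi - phi * (hi - lo), lo + phi * (hi - lo)
            if f(c) < f(d):
                hi = d
            else:
                lo = c
        return (lo + hi) / 2.0

    A = np.random.randn(64, 16)            # overdetermined system, e.g. an array steering matrix
    x_true = np.random.randn(16)
    b = A @ x_true + 0.05 * np.random.randn(64)
    log_lam = golden_section_search(lambda l: gcv(A, b, 10.0 ** l), -8.0, 2.0)
    x_hat = tikhonov_solve(A, b, 10.0 ** log_lam)
    print("chosen lambda:", 10.0 ** log_lam,
          "relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))

The golden section search only needs the cost to be unimodal in the regularization parameter, which is why it is a common low-cost alternative to a dense grid search.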

Article
Low Frequency Interactive Auralization Based on a Plane Wave Expansion
by Diego Mauricio Murillo Gómez, Jeremy Astley and Filippo Maria Fazi
Appl. Sci. 2017, 7(6), 558; https://doi.org/10.3390/app7060558 - 27 May 2017
Cited by 7 | Viewed by 4753
Abstract
This paper addresses the problem of interactive auralization of enclosures based on a finite superposition of plane waves. For this, room acoustic simulations are performed using the Finite Element (FE) method. From the FE solution, a virtual microphone array is created and an inverse method is implemented to estimate the complex amplitudes of the plane waves. The effects of Tikhonov regularization are also considered in the formulation of the inverse problem, which leads to a more efficient solution in terms of the energy used to reconstruct the acoustic field. Based on this sound field representation, translation and rotation operators are derived, enabling the listener to move within the enclosure and listen to the changes in the acoustic field. An implementation of an auralization system based on the proposed methodology is presented. The results suggest that the plane wave expansion is a suitable approach to synthesize sound fields. Its advantage lies in the flexibility it offers for implementing several sound reproduction techniques for auralization applications. Furthermore, features such as translation and rotation of the acoustic field make it convenient for interactive acoustic renderings.
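
The plane-wave representation and the translation operator mentioned above can be sketched as follows. The number of plane waves, the frequency, the two-dimensional geometry, and the regularized least-squares fit are assumptions for illustration, not the paper's exact formulation.

    import numpy as np

    c = 343.0                         # speed of sound in m/s
    f = 125.0                         # a low frequency (value assumed)
    k = 2.0 * np.pi * f / c

    # Plane-wave directions on the horizontal plane (assumption: 2-D expansion).
    n_waves = 32
    angles = 2.0 * np.pi * np.arange(n_waves) / n_waves
    k_vecs = k * np.stack([np.cos(angles), np.sin(angles)], axis=1)       # (n_waves, 2)

    def steering_matrix(points):
        """Each column is one plane wave exp(-j k.x) evaluated at the given points."""
        return np.exp(-1j * points @ k_vecs.T)                            # (n_points, n_waves)

    # Virtual microphone positions and a 'measured' field (here: one reference plane wave).
    mics = np.random.uniform(-1.0, 1.0, size=(64, 2))
    p_meas = np.exp(-1j * mics @ (k * np.array([1.0, 0.0])))

    # Regularized inverse problem for the plane-wave amplitudes (Tikhonov, lambda assumed).
    A = steering_matrix(mics)
    lam = 1e-3
    amps = np.linalg.solve(A.conj().T @ A + lam * np.eye(n_waves), A.conj().T @ p_meas)

    # Translating the listener by dx simply phase-shifts each plane-wave amplitude.
    dx = np.array([0.25, 0.10])
    amps_translated = amps * np.exp(-1j * k_vecs @ dx)

    # Resynthesized pressure at the translated listening position (local origin of the expansion).
    p_at_dx = np.sum(amps_translated)
    print(abs(p_at_dx), "vs. true field magnitude", abs(np.exp(-1j * k * dx[0])))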

Article
Stereophonic Microphone Array for the Recording of the Direct Sound Field in a Reverberant Environment
by Jonathan Albert Gößwein, Julian Grosse and Steven Van de Par
Appl. Sci. 2017, 7(6), 541; https://doi.org/10.3390/app7060541 - 24 May 2017
Cited by 1 | Viewed by 5244
Abstract
State-of-the-art stereo recording techniques using two microphones have two main disadvantages: first, a limited reduction of the reverberation in the direct sound component, and second, compression or expansion of the angular position of sound sources. To address these disadvantages, the aim of this study is the development of a true stereo recording microphone array that records the direct and reverberant sound fields separately. This array can be used within the recording and playback configuration developed in Grosse and van de Par, 2015. Instead of using only two microphones, the proposed method combines two logarithmically-spaced microphone arrays whose directivity patterns are optimized with a superdirective beamforming algorithm. The optimization allows better control of the overall beam pattern and of interchannel level differences. A comparison between the newly-proposed system and existing microphone techniques shows that a lower percentage of reverberance is recorded within the sound field.
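
A superdirective (MVDR-style) beamformer of the general kind used to optimize the array's directivity patterns can be sketched for a single frequency as follows; the array geometry, diffuse-noise coherence model, and diagonal loading are assumptions, not the authors' actual design.

    import numpy as np

    c = 343.0
    f = 1000.0
    # Logarithmically spaced linear array (actual spacing assumed).
    x = np.logspace(np.log10(0.01), np.log10(0.2), 5)      # microphone positions in metres

    def steering(theta):
        """Far-field steering vector for a plane wave from direction theta (broadside = 0)."""
        return np.exp(-2j * np.pi * f / c * x * np.sin(theta))

    # Diffuse (spherically isotropic) noise coherence matrix, plus diagonal loading
    # to limit white-noise gain (the classic superdirective trade-off).
    dists = np.abs(x[:, None] - x[None, :])
    Gamma = np.sinc(2.0 * f * dists / c)                    # np.sinc includes the factor pi
    R = Gamma + 1e-2 * np.eye(len(x))

    d = steering(0.0)                                       # look direction: broadside
    w = np.linalg.solve(R, d)
    w /= d.conj() @ w                                       # distortionless constraint w^H d = 1

    # Resulting beam pattern over incidence angles.
    thetas = np.linspace(-np.pi / 2, np.pi / 2, 181)
    pattern = np.array([abs(w.conj() @ steering(t)) ** 2 for t in thetas])
    print("front-to-side suppression (dB):", 10 * np.log10(pattern[90] / pattern[0]))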

Article
Late Reverberation Synthesis Using Filtered Velvet Noise
by Vesa Välimäki, Bo Holm-Rasmussen, Benoit Alary and Heidi-Maria Lehtonen
Appl. Sci. 2017, 7(5), 483; https://doi.org/10.3390/app7050483 - 06 May 2017
Cited by 25 | Viewed by 6251
Abstract
This paper discusses the modeling of the late part of a room impulse response by dividing it into short segments and approximating each one as a filtered random sequence. The filters and their associated gain account for the spectral shape and decay of the overall response. The noise segments are realized with velvet noise, which is sparse pseudo-random noise. The proposed approach leads to a parametric representation and computationally efficient artificial reverberation, since convolution with velvet noise reduces to a multiplication-free sparse sum. Cascading of the differential coloration filters is proposed to further reduce the computational cost. A subjective test shows that the resulting approximation of the late reverberation often leads to a noticeable difference in comparison to the original impulse response, especially with transient sounds, but the difference is minor. The proposed method is very efficient in terms of real-time computational cost and memory storage, and will be useful for spatial audio applications.
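
The property exploited above, that convolution with velvet noise requires no multiplications, can be demonstrated with a small sketch; the pulse density and segment lengths below are assumptions chosen for demonstration, not the values used in the paper.

    import numpy as np

    def velvet_noise(length, fs, density=2000):
        """Sparse ternary velvet-noise sequence: one +/-1 impulse per grid period."""
        grid = fs / density                          # average impulse spacing in samples
        seq = np.zeros(length)
        for m in range(int(length / grid)):
            idx = int(m * grid + np.random.uniform(0, grid - 1))
            seq[min(idx, length - 1)] = np.random.choice([-1.0, 1.0])
        return seq

    def sparse_convolve(signal, velvet):
        """Convolution with velvet noise as a sum of delayed, sign-flipped signal copies."""
        out = np.zeros(len(signal) + len(velvet) - 1)
        for idx in np.flatnonzero(velvet):
            out[idx:idx + len(signal)] += velvet[idx] * signal   # +/- copy, no real multiply
        return out

    fs = 44100
    vn = velvet_noise(fs // 10, fs)                  # 100 ms of velvet noise
    x = np.random.randn(fs // 100)                   # 10 ms input segment
    y = sparse_convolve(x, vn)
    print(np.allclose(y, np.convolve(x, vn)))        # matches direct convolution

Because the velvet sequence contains only a few +/-1 entries per grid period, the inner loop adds and subtracts delayed signal copies instead of performing a full convolution, which is the source of the efficiency claimed in the abstract.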

Article
Objective Evaluation Techniques for Pairwise Panning-Based Stereo Upmix Algorithms for Spatial Audio
by Martin Mieth and Udo Zölzer
Appl. Sci. 2017, 7(4), 374; https://doi.org/10.3390/app7040374 - 10 Apr 2017
Cited by 2 | Viewed by 4200
Abstract
Techniques for generating multichannel audio from stereo audio signals are supposed to enhance and extend the listener’s experience. To assess the quality of such upmix algorithms, subjective evaluations have been carried out. In this paper, we propose an objective evaluation test for stereo-to-multichannel upmix algorithms. Based on defined objective criteria and special test signals, an objective comparative evaluation is enabled in order to obtain a quantifiable measure of the quality of stereo-to-multichannel upmix algorithms. The basic functional principle of the evaluation test is demonstrated, and it is illustrated how possible results can be visualized. In addition, the proposed criteria are introduced for the optimization of upmix algorithms and also for the clarification and illustration of the impacts and influences of different modes and parameters.

Article
The Reduction of Vertical Interchannel Crosstalk: The Analysis of Localisation Thresholds for Natural Sound Sources
by Rory Wallis and Hyunkook Lee
Appl. Sci. 2017, 7(3), 278; https://doi.org/10.3390/app7030278 - 14 Mar 2017
Cited by 12 | Viewed by 5272
Abstract
In subjective listening tests, natural sound sources were presented to subjects as vertically-oriented phantom images from two layers of loudspeakers, ‘height’ and ‘main’. Subjects were required to reduce the amplitude of the height layer until the position of the resultant sound source matched that of the same source presented from the main layer only (the localisation threshold). Delays of 0, 1 and 10 ms were applied to the height layer with respect to the main, with vertical stereophonic and quadraphonic conditions being tested. The results of the study showed that the localisation thresholds obtained were not significantly affected by sound source or presentation method. Instead, the only variable whose effect was significant was interchannel time difference (ICTD). For ICTD of 0 ms, the median threshold was −9.5 dB, which was significantly lower than the −7 dB found for both 1 and 10 ms. The results of the study have implications both for the recording of sound sources for three-dimensional (3D) audio reproduction formats and also for the rendering of 3D images.

Article
A Measure Based on Beamforming Power for Evaluation of Sound Field Reproduction Performance
by Ji-Ho Chang and Cheol-Ho Jeong
Appl. Sci. 2017, 7(3), 249; https://doi.org/10.3390/app7030249 - 03 Mar 2017
Cited by 4 | Viewed by 3545
Abstract
This paper proposes a measure to evaluate sound field reproduction systems with an array of loudspeakers. The spatially-averaged squared error of the sound pressure between the desired and the reproduced field, namely the spatial error, has been widely used, but it has considerable problems in two conditions. First, in non-anechoic conditions, room reflections substantially deteriorate the spatial error, although these room reflections affect human localization to a lesser degree. Second, for 2.5-dimensional reproduction of spherical waves, the spatial error increases consistently due to the difference in the amplitude decay rate, whereas the degradation of human localization performance is limited. The measure proposed in this study is based on the beamforming powers of the desired and the reproduced fields. Simulation and experimental results show that the proposed measure is less sensitive to room reflections and the amplitude decay than the spatial error, and is therefore likely to agree better with the human perception of source localization.
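
The idea of comparing beamforming powers of the desired and reproduced fields can be sketched with a simple delay-and-sum beamformer on a circular evaluation array; the geometry, frequency, and the mean-squared comparison of normalized power maps below are illustrative assumptions rather than the measure defined in the paper.

    import numpy as np

    c, f = 343.0, 500.0
    k = 2.0 * np.pi * f / c

    # Circular array of virtual microphones in the listening area (geometry assumed).
    n_mics, radius = 24, 0.5
    phi = 2.0 * np.pi * np.arange(n_mics) / n_mics
    mics = radius * np.stack([np.cos(phi), np.sin(phi)], axis=1)

    def plane_wave(points, theta):
        """Pressure of a unit plane wave from direction theta at the given points."""
        k_vec = k * np.array([np.cos(theta), np.sin(theta)])
        return np.exp(-1j * points @ k_vec)

    def beam_power(pressure, thetas):
        """Delay-and-sum (conventional) beamforming power of a captured field over look angles."""
        return np.array([abs(plane_wave(mics, t).conj() @ pressure) ** 2
                         for t in thetas]) / n_mics ** 2

    thetas = np.linspace(0.0, 2.0 * np.pi, 72, endpoint=False)
    p_desired = plane_wave(mics, 0.0)                              # desired field
    p_reproduced = p_desired + 0.2 * plane_wave(mics, np.pi / 3)   # toy reproduction error

    bp_d = beam_power(p_desired, thetas)
    bp_r = beam_power(p_reproduced, thetas)
    error = np.mean((bp_r / bp_r.max() - bp_d / bp_d.max()) ** 2)  # placeholder comparison
    print(f"beamforming-power mismatch: {error:.4f}")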

Review

Review
Spatial Audio for Soundscape Design: Recording and Reproduction
by Joo Young Hong, Jianjun He, Bhan Lam, Rishabh Gupta and Woon-Seng Gan
Appl. Sci. 2017, 7(6), 627; https://doi.org/10.3390/app7060627 - 16 Jun 2017
Cited by 66 | Viewed by 15438
Abstract
With the advancement of spatial audio technologies, in both recording and reproduction, we are seeing more applications that incorporate 3D sound to create an immersive aural experience. Soundscape design and evaluation for urban planning can now tap into the extensive spatial audio tools for sound capture and 3D sound rendering over headphones and speaker arrays. In this paper, we outline available state-of-the-art spatial audio recording techniques and devices, physical and perceptual spatial audio reproduction techniques, and emerging spatial audio techniques for virtual and augmented reality, followed by a discussion of the degree of perceptual accuracy of recording and reproduction techniques in representing the acoustic environment.

Review
Surround by Sound: A Review of Spatial Audio Recording and Reproduction
by Wen Zhang, Parasanga N. Samarasinghe, Hanchi Chen and Thushara D. Abhayapala
Appl. Sci. 2017, 7(5), 532; https://doi.org/10.3390/app7050532 - 20 May 2017
Cited by 74 | Viewed by 16190
Abstract
In this article, a systematic overview of various recording and reproduction techniques for spatial audio is presented. While binaural recording and rendering are designed to resemble the human two-ear auditory system and reproduce sounds specifically for a listener’s two ears, soundfield recording and reproduction using a large number of microphones and loudspeakers replicate an acoustic scene within a region. These two fundamentally different types of techniques are discussed in the paper. A recent popular area, multi-zone reproduction, is also briefly reviewed. The article concludes with a discussion of the current state of the field and open problems.

Other

Correction
Correction: Wendt, F.; et al. Auditory Distance Control Using a Variable-Directivity Loudspeaker. Appl. Sci. 2017, 7, 666
by Florian Wendt, Franz Zotter, Matthias Frank and Robert Höldrich
Appl. Sci. 2017, 7(11), 1174; https://doi.org/10.3390/app7111174 - 15 Nov 2017
Viewed by 2512
Abstract
We, the authors, wish to make the following corrections to our paper [...]
