Machine Understanding of Music and Sound

A special issue of Algorithms (ISSN 1999-4893).

Deadline for manuscript submissions: closed (30 June 2022) | Viewed by 4425

Special Issue Editor

Computer Science, Oregon State University, Corvallis, OR 97702, USA
Interests: deep learning; non-speech audio; educational data mining; large imbalanced datasets; machine learning in the musical domain

Special Issue Information

Dear Colleagues,

In a world interconnected by accessible data and smart applications, the need for intelligent methods to understand music and sound continues to grow. From computational musicology to music information retrieval, and from emotion recognition to bioacoustic understanding, data-driven algorithms are increasingly employed to analyze, interpret, and generate sound. These diverse applications include the analysis and transformation of sound, detection and classification of audio events, representation and sonification of data, bioacoustic analysis and interpretation, music recommendation and search systems, recognition of emotion or musical genre, automatic transcription and song recognition, and creative endeavors such as composition and sound synthesis.

While advances in theoretical machine learning continue to be applied to research questions in the domain of non-speech audio, many challenges remain. These include the limited availability of annotated data, complications introduced by noise, poor generalizability across datasets, difficulty in interpreting trained neural network models, and the absence of performance metrics for evaluating creative work. To address these challenges, ongoing research continues to develop new methods and techniques, leverage multimodal information, integrate deep learning with human perception, and apply existing machine learning techniques to novel applications in music and sound.

This Special Issue on “Machine Understanding of Music and Sound” calls for manuscripts proposing novel machine learning and deep learning methods, approaches, and applications that advance a computational understanding of music and sound.

Dr. Patrick Donnelly
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, authors can access the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss francs). Submitted papers should be well formatted and written in clear English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification of music/audio
  • music recognition
  • genre classification
  • style recognition
  • music similarity
  • music recommendation systems
  • music sentiment analysis
  • music transcription
  • score alignment
  • expressive performance modeling
  • musical style transfer
  • music acoustics
  • sound synthesis
  • algorithmic composition
  • intelligent signal processing
  • source separation
  • evaluation metrics
  • bioacoustics

Published Papers (2 papers)


Research

19 pages, 4449 KiB  
Article
Optical Medieval Music Recognition Using Background Knowledge
by Alexander Hartelt and Frank Puppe
Algorithms 2022, 15(7), 221; https://doi.org/10.3390/a15070221 - 22 Jun 2022
Cited by 1 | Viewed by 1574
Abstract
This paper examines the effect of exploiting background knowledge, whose use has been neglected in the literature, to improve an Optical Music Recognition (OMR) deep learning pipeline for transcribing medieval, monophonic, handwritten music from the 12th–14th centuries. Various types of background knowledge about overlapping notes and text, clefs, and graphical connections (neumes), together with their implications for the staff positions of notes, were used and evaluated. Moreover, the effects of different encoder/decoder architectures and of different datasets for training a mixed model and for document-specific fine-tuning were evaluated, based on an extended OMR pipeline with an additional post-processing step. The use of background models improves all metrics, in particular the melody accuracy rate (mAR), which is based on the insert, delete, and replace operations necessary to convert the generated melody into the correct melody. When using a mixed model and evaluating on a different dataset, the best model achieves, without fine-tuning and without post-processing, a mAR of 90.4%; background knowledge raises this to 93.2% mAR, cutting the remaining error by nearly 30%. With additional fine-tuning, the contribution of post-processing is even greater: the baseline mAR of 90.5% rises to 95.8% mAR, a reduction of more than 50% in the remaining error.
(This article belongs to the Special Issue Machine Understanding of Music and Sound)
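The melody accuracy rate described in the abstract is edit-distance based. As an illustrative sketch only (the paper's exact normalization may differ), one common formulation normalizes the Levenshtein distance between the generated and reference melodies by the reference length:

```python
def levenshtein(a, b):
    """Minimum number of insert, delete, and replace operations to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete
                            curr[j - 1] + 1,            # insert
                            prev[j - 1] + (ca != cb)))  # replace
        prev = curr
    return prev[-1]

def melody_accuracy_rate(generated, reference):
    """Accuracy as 1 minus normalized edit distance (illustrative, not the paper's exact definition)."""
    return 1.0 - levenshtein(generated, reference) / max(len(reference), 1)

# One wrong note among five yields 80% accuracy under this formulation
print(melody_accuracy_rate("CDEFG", "CDEAG"))  # 0.8
```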

21 pages, 1169 KiB  
Article
Large-Scale Multimodal Piano Music Identification Using Marketplace Fingerprinting
by Daniel Yang, Arya Goutam, Kevin Ji and TJ Tsai
Algorithms 2022, 15(5), 146; https://doi.org/10.3390/a15050146 - 26 Apr 2022
Cited by 3 | Viewed by 1975
Abstract
This paper studies the problem of identifying piano music in various modalities using a single, unified approach called marketplace fingerprinting. The key defining characteristic of marketplace fingerprinting is choice: we consider a broad range of fingerprint designs based on a generalization of standard n-grams, and then select the fingerprint designs at runtime that are best for a specific query. We show that the large-scale retrieval problem can be framed as an economics problem in which a consumer and a store interact. In our analogy, the runtime search is like a consumer shopping in the store, the items for sale correspond to fingerprints, and purchasing an item corresponds to performing a fingerprint lookup in the database. Using basic principles of economics, we design an efficient marketplace in which the consumer has many options and adopts a rational buying strategy that explicitly considers the cost and expected utility of each item. We evaluate our marketplace fingerprinting approach on four different sheet music retrieval tasks involving sheet music images, MIDI files, and audio recordings. Using a database containing approximately 375,000 pages of sheet music, our method achieves a mean reciprocal rank of 0.91 with sub-second average runtime on cell phone image queries. On all four retrieval tasks, the marketplace method substantially outperforms previous methods while simultaneously reducing average runtime. We present comprehensive experimental results, as well as detailed analyses to provide deeper intuition into system behavior.
(This article belongs to the Special Issue Machine Understanding of Music and Sound)
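Two elements of the abstract are standard enough to sketch: fingerprints built from n-grams over a symbolic note sequence, and mean reciprocal rank (MRR), the reported retrieval metric. The fingerprint below is a plain n-gram over pitch intervals, a simple baseline and not the paper's generalized marketplace design:

```python
def interval_ngrams(pitches, n=3):
    """Fingerprints as n-grams of pitch intervals (transposition-invariant).
    A plain baseline, not the paper's generalized fingerprint scheme."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]

def mean_reciprocal_rank(ranks):
    """MRR over the 1-based rank of the correct item for each query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# MIDI pitches C4 D4 E4 F4 G4; transposed copies share the same fingerprints
print(interval_ngrams([60, 62, 64, 65, 67]))  # [(2, 2, 1), (2, 1, 2)]
print(mean_reciprocal_rank([1, 1, 2, 4]))     # 0.6875
```

Because the fingerprints are interval-based, a query transposed to any key still matches the database entries, which matters for audio and MIDI queries of the same piece.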
