#!/usr/bin/env python
# coding: utf-8

# Sascha Spors,
# Professorship Signal Theory and Digital Signal Processing,
# Institute of Communications Engineering (INT),
# Faculty of Computer Science and Electrical Engineering (IEF),
# University of Rostock,
# Germany
#
# # Data Driven Audio Signal Processing - A Tutorial with Computational Examples
#
# Winter Semester 2023/24 (Master Course #24512)
#
# - lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture
# - tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise
#
# Feel free to contact the lecturer: frank.schultz@uni-rostock.de

# # Exercise 1: Introduction to DDASP
#
# We introduce the topic and set general objectives for this tutorial. We share some thoughts on best engineering practices and discuss the established procedure for structured development of data-driven methods. Useful Python packages are listed. Exemplary machine learning based audio applications are briefly outlined.

# ## General Objective
#
# - For engineers, **understanding the essence** of a concept is more important than a strict math proof
#     - as engineers we can leave proofs to mathematicians
#     - *example*: understanding the 4 matrix subspaces and the matrix (pseudo)-inverse based on the SVD is essential (need to know); in-depth proofs on this fundamental topic are nice to have
# - We should
#     - understand building blocks of machine learning for audio data processing
#     - create simple tool chains from these building blocks
#     - create simple applications from these tool chains
#     - get an impression of real industrial applications and their algorithmic and data effort
#     - get in touch with scientific literature
#         - where to find, how to read
#         - there we will find the latest tool chain inventions (if published at all; a lot of material is either unavailable due to company secrets, or only patent specifications exist, which usually omit heavy math and important details)
#         - interpretation of results
#         - reproducibility
#         - re-inventing a tool chain
#     - get in touch with major software libraries (in Python), see below

# ## Useful Python Packages
#
# - `numpy` for matrix/tensor algebra
# - `scipy` for scientific computing (e.g. signal processing, optimization)
# - `matplotlib` for plotting
# - `scikit-learn` for predictive data analysis, machine learning
# - `statsmodels` for statistical models, i.e. machine learning driven by the statistics community
# - `tensorflow` for deep learning with DNNs, CNNs...
# - `keras-tuner` for convenient hyperparameter tuning
# - `pytorch` for deep learning with DNNs, CNNs..., audio handling
# - `pandas` for data handling
#
# Audio-related packages that we might use here and there:
#
# - `librosa` + `ffmpeg` for music/audio analysis + en-/decoding/stream support
# - pip:
#     - sounddevice
#     - soundfile
#     - pyloudnorm

# ## Best Engineering Practice
#
# - engineering is about creating (hopefully useful) tools by using existing tools
# - models are tools and thus fit perfectly into the engineering community
# - we should know the tools we use and create in great detail
# - aspects of responsibility, ethics, morals
# - thoroughly reflecting on our engineering task before starting is a good idea
#     - critical reflection (higher good vs. earning money)
#     - do we really need machine support here
#     - if so, how can machines support us here, how do humans solve this task
#     - what do machines do better here than humans, and vice versa
#     - what is our expectation of the model performance
#     - handcrafted model vs. machine learned model (problem: model transparency)
#     - ...

# ## Established Procedure
#
# The established procedure for structured development of data-driven methods (cf. the lecture) is:
#
# 1. Definition of the problem and of performance measures
# 2. Data preparation and feature extraction
# 3. Spot check potential model architectures
# 4. Model selection
# 5. Evaluation and reporting
# 6. Application
#
# If we neglect thinking about 1. and 2., we will almost certainly under-perform in 3. and 4., which directly affects 5. and 6. Thus, we really should take the whole chain seriously. We hopefully do this all the time in the lecture and exercise.

# ## Applications for Machine Learning in Audio
#
# Some examples of applications are given below. Nowadays industrial applications use a combination of different ML techniques to provide an intended consumer service.
#
# - supervised learning (mostly prediction by classification / regression)
#     - query by humming
#     - music/genre recognition & recommendation
#     - speech recognition
#     - disease prediction by sound analysis of breathing / coughing
#     - acoustic surveillance of machines (cf. keyboard noise to text?!)
#     - gun shot / alert sound detection
#     - beam forming / direction of arrival (DOA)
#     - composing (cf. Beethoven's Symphony No. 10)
#     - deep audio fakes (human-made vs. machine-made replica)
#     - Auto EQ (mix should sound like a reference mix?!)
# - unsupervised learning (mostly clustering, dimensionality reduction)
#     - noise reduction
#     - echo cancellation
#     - feedback cancellation
#     - speech / language recognition
#     - compression
#     - feature creation (typical spectrum of pop music, classical...)
#     - feature calculation (perceived loudness, cf. replay gain adaptation)
#     - key recognition
# - reinforcement learning
#     - human tasks: how to compose a hit single, how to mix a hit single

# ## Ideas for Student Projects
#
# - song recognition (recognize a song out of a database)
# - key recognition (recognize the key a song is written in)
# - chord recognition (recognize simple chords and chord progressions)
# - de-noising (reduce noise in audio material, for example to improve speech intelligibility)
# - genre classification and recommendation service

# ## Copyright
#
# - the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
# - the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)
# - the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)
# - feel free to use the notebooks for your own purposes
# - please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year.
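
# The example named under General Objective (the 4 matrix subspaces and the pseudo-inverse based on the SVD) can be illustrated with a few lines of `numpy`. This is only a minimal sketch under stated assumptions: the matrix `A` is a hypothetical rank-deficient toy example, and the rank tolerance `tol` is an assumed value, not a recommendation from the lecture.

```python
import numpy as np

# Hypothetical 4 x 3 toy matrix with rank 2 (third column = first + second)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])

# Full SVD: A = U @ diag(s) @ Vh, with U (4 x 4), s (3,), Vh (3 x 3)
U, s, Vh = np.linalg.svd(A)

# Numerical rank: count singular values above an assumed tolerance
tol = 1e-10
r = int(np.sum(s > tol))

# The four subspaces, each spanned by orthonormal columns
col_space = U[:, :r]       # column space of A
left_null = U[:, r:]       # left null space of A
row_space = Vh[:r, :].T    # row space of A
null_space = Vh[r:, :].T   # null space of A

# Pseudo-inverse: invert only the non-zero singular values,
# mapping the column space back to the row space
A_pinv = row_space @ np.diag(1.0 / s[:r]) @ col_space.T

print("rank:", r)
print("A @ null_space is zero:", np.allclose(A @ null_space, 0))
print("left_null.T @ A is zero:", np.allclose(left_null.T @ A, 0))
print("A @ A_pinv @ A equals A:", np.allclose(A @ A_pinv @ A, A))
```

# In practice `np.linalg.pinv` provides the pseudo-inverse directly; building it by hand from `U`, `s` and `Vh` is only meant to make the role of the four subspaces visible.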