This notebook demonstrates how to "sonify" a simple data set, converting the data to an audio form to help us try to perceive patterns within that data.
The notebook uses the Python pandas package to represent and visualise a dataset, together with packages from the Jupyter notebook environment that support audio playback and the embedding of interactive widgets for exploring a dataset.
Run the following code cells to set up the notebook environment.
#IPython package for embedding an audio player in a notebook
from IPython.display import Audio
#The ipywidgets.interact function supports interactive widget creation
from ipywidgets import interact
#Configure the notebook to display charts inline
%matplotlib inline
#Import a routine to plot simple datasets
from matplotlib.pyplot import plot
#Import the pandas package for working with tabular datasets
import pandas as pd
#Various packages that provide access to maths functions
import random
import numpy as np
import math
The following examples are based on a sonification notebook published by Doug Blank at Bryn Mawr College.
Let's start with a demonstration of how to create and play a simple tone (a sine wave). You do not need to understand how the code works in order to run the code cell.
If you want the audio player to play the tone automatically when the code cell is run, change the setting from autoplay=False to autoplay=True.
Run the code cells below now.
duration = 2 # duration of the tone in seconds
rate = 44100 # sampling rate of the tone
#https://jupyter.brynmawr.edu/services/public/dblank/jupyter.cs/Sonification.ipynb
#This rather complicated looking function creates a function
#It allows us to retrieve the value of a sine wave of a particular frequency at a particular time.
#The frequency is the frequency of the tone in hertz (Hz)
def make_tone(frequency):
    def f(t):
        return math.sin(2 * math.pi * frequency * t)
    return f
tone440 = make_tone(440)
#Visualise the tone over a short period (0.01s)
plot([tone440(t) for t in np.arange(0, .01, 1/rate)]);
#Generate the tone and play it through a notebook embedded audio player
Audio([tone440(t) for t in np.arange(0, duration, 1/rate)], rate=rate, autoplay=False)
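The list comprehension above evaluates the sine function one sample at a time. As an alternative sketch, the same kind of tone can be computed in a single step using NumPy array operations. The helper name make_tone_array is an assumption introduced here for illustration; it is not part of the original notebook.

```python
import numpy as np

def make_tone_array(frequency, duration, rate):
    """Return the samples of a sine wave as a NumPy array (hypothetical helper)."""
    # Sample times: 0, 1/rate, 2/rate, ... up to (but not including) duration
    t = np.arange(int(duration * rate)) / rate
    # Evaluate the sine wave at every sample time at once
    return np.sin(2 * np.pi * frequency * t)

samples = make_tone_array(440, duration=2, rate=44100)
print(len(samples))  # number of samples = duration * rate
```

The resulting array can be passed to the audio player in the same way as the list, for example Audio(samples, rate=44100).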
As well as playing simple tones, we can also generate audio signals of varying frequencies based on data values contained within a dataset.
We will create a couple of helper functions to generate an "audio" dataset from a dataset.
# Function to find the value of a sine wave of a given frequency at a particular time
def make_tone2(t, frequency):
    return math.sin(2 * math.pi * frequency * t)
#Grab a list of frequency values from a dataset for playback over a given period at a given sampling rate
def makeAudio(data, duration, rate, minf=200, maxf=6000):
    #Find the data range once, rather than for every sample
    data_min, data_max = min(data), max(data)
    audiodata=[]
    for t in np.arange(0, duration, 1/rate):
        data_index = math.floor(t/duration * len(data))
        ratio = (data[data_index] - data_min)/(data_max - data_min)
        audiodata.append(make_tone2(t, ratio * (maxf - minf) + minf))
    return audiodata
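To make the frequency mapping used inside makeAudio concrete, the following sketch isolates that one step: each data value is rescaled to a ratio between 0 and 1, and the ratio is then mapped linearly onto the range minf to maxf. The function name value_to_frequency is hypothetical and is used only for illustration.

```python
def value_to_frequency(value, data_min, data_max, minf=200, maxf=6000):
    """Linearly map a data value onto a frequency in hertz (hypothetical helper)."""
    # Position of the value within the data range, from 0 to 1
    ratio = (value - data_min) / (data_max - data_min)
    # Scale the ratio onto the frequency range
    return ratio * (maxf - minf) + minf

data = [0.0, 0.5, 1.0]
freqs = [value_to_frequency(v, min(data), max(data)) for v in data]
print(freqs)  # [200.0, 3100.0, 6000.0]
```

The smallest data value is played at the minimum frequency and the largest at the maximum frequency. Note that the mapping is undefined when every data value is the same, because the data range is then zero.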
Generate a sample dataset with several columns, each representing a set of frequencies over time:
#Generate the dataset and preview it as a chart
numpoints = 50
df = pd.DataFrame({'x':np.arange(0, numpoints)})
#Generate a column with values proportional to time
df['y'] = df['x'] / numpoints
#Generate a column with values inversely proportional to time
df['y2'] = 1 - df['y']
#Generate a column with values proportional to time with the addition of random noise
df['y3'] = df.apply(lambda x: x['y']+((random.random()/5) - 0.1), axis=1)
ax = df.plot(kind='scatter', x='x', y='y', color='grey')
df.plot(kind='scatter', x='x', y='y2', color='green', ax=ax)
df.plot(kind='scatter', x='x', y='y3', color='red', ax=ax)
ax.set_xlabel('Sample number')
ax.set_ylabel('Value');
minf=200 #Minimum frequency
maxf=6000 #Maximum frequency
duration = 2
#Select a column to visualise and sonify
col = 'y'
df.plot(kind='scatter', x= 'x', y=col);
Audio(makeAudio(df[col], duration, rate, minf, maxf), rate=rate, autoplay=True)
#Example of reordering a list of numbers in a random order
nums = list(range(0,10))
random.shuffle(nums)
nums
[7, 3, 6, 4, 5, 0, 8, 2, 9, 1]
samples = 100
df = pd.DataFrame({'x':np.arange(0,samples)})
nums=list(range(0,samples))
random.shuffle(nums)
df['y'] = nums
df.plot(kind='scatter',x= 'x',y='y');
Audio(makeAudio(df['y'], duration, rate), rate=rate, autoplay=True)