Reading and Writing Audio Files with the `soundfile` module


There are many libraries for handling audio files with Python (see overview page), but the best one is probably the soundfile module.

Full documentation including installation instructions is available at http://pysoundfile.readthedocs.org/.

Advantages:

- supports many file formats (thanks to libsndfile)
  - WAV, OGG, FLAC and many more
  - see the bottom of this notebook for the full list of supported formats
- supports 24-bit PCM and 32-bit floating point WAV files
- WAVEX support
- can read parts of audio files
- automatic type conversion and normalization
- works in CPython 2.x and 3.x and in PyPy as well
- provides audio data as NumPy arrays by default, but it can also work with plain Python buffer objects if NumPy is not available

Disadvantages:

no MP3 support in older versions (reading and writing MP3 requires libsndfile 1.1.0 or newer)

Installation:

python3 -m pip install soundfile

Reading

This is the quickest way to load a WAV file into a NumPy array (using soundfile.read()):

In [ ]:

import soundfile as sf
sig, samplerate = sf.read('data/test_wav_pcm16.wav')

That's all. Easy, isn't it?

But let's have a closer look ...

The test file is not very typical: it has only 15 frames, but 7 channels:

In [ ]:

sig.shape

In [ ]:

samplerate

Let's check the contents of the file by plotting the audio waveform:

In [ ]:

import matplotlib.pyplot as plt

In [ ]:

plt.plot(sig);

Looking good!

In most cases soundfile.read() is all you need, but for some advanced use cases, you might want to use a soundfile.SoundFile object instead:

In [ ]:

f = sf.SoundFile('data/test_wav_pcm16.wav')

In [ ]:

len(f), f.channels, f.samplerate

In [ ]:

f.format, f.subtype, f.endian

In [ ]:

test = f.read()
test.shape

In [ ]:

plt.plot(test);

In [ ]:

(test == sig).all()

As you can see, you get the same data as with sf.read().

In [ ]:

# TODO: read mono file
# mono data is by default returned as one-dimensional NumPy array,
# this can be changed with always_2d=True

24-bit files work:

In [ ]:

sig, samplerate = sf.read('data/test_wav_pcm24.wav')
plt.plot(sig);

WAVEX is supported:

In [ ]:

sig, samplerate = sf.read('data/test_wavex_pcm16.wav')
plt.plot(sig);

In [ ]:

sig, samplerate = sf.read('data/test_wavex_pcm24.wav')
plt.plot(sig);

32-bit float files work:

In [ ]:

sig, samplerate = sf.read('data/test_wav_float32.wav')
plt.plot(sig);

In [ ]:

sig, samplerate = sf.read('data/test_wavex_float32.wav')
plt.plot(sig);

Writing

Writing audio data to a file (using soundfile.write()) is as simple as reading from a file:

In [ ]:

sf.write('my_pcm16_file.wav', sig, samplerate)

Let's check if this file has really been written:

In [ ]:

!sndfile-info my_pcm16_file.wav

Note that by default, WAV files are written as 16-bit fixed point data (a.k.a. 'PCM_16'). You can find the default setting for each file format with soundfile.default_subtype():

In [ ]:

sf.default_subtype('WAV')

If you want to save your file with a better quality setting (especially if you want to do further processing later), you can, for example, use the 32-bit floating point format:

In [ ]:

sf.write('my_float_file.wav', sig, samplerate, subtype='FLOAT')

Let's check if this worked:

In [ ]:

!sndfile-info my_float_file.wav

You can get all available subtypes for a given format with soundfile.available_subtypes():

In [ ]:

sf.available_subtypes('WAV')

Available Formats

You can get all available formats with soundfile.available_formats():

In [ ]:

sf.available_formats()

... and all available subtypes with soundfile.available_subtypes():

In [ ]:

sf.available_subtypes()

Version Info

In [ ]:

print("PySoundFile version:", sf.__version__)

import sys
print("Python version:", sys.version)

To the extent possible under law, the person who associated CC0 with this work has waived all copyright and related or neighboring rights to this work.
