This is the Jupyter Notebook for analyzing Sunrise Festival data (www.hamsci.org/sunrisefest). It was developed by Kristina Collins KD8OXT, based on work by Cuong Nguyen (ORCID 0000-0002-3769-7556) and other students at the University of Scranton.
In this notebook, you'll compare your recorded signal to a template of the signal*. By finding the cross-correlation between parts of the template and your recording, you will identify the time at which you observed each part and the time between parts. You will also look for evidence of multipath propagation and identify the delay between paths. After you submit your results, they will be combined with other submissions to look for the effects of sunrise on propagation around the world.
* The template signal is actually the same audio file used at the source transmitters WWV and WWVH! You can download and experiment with the files yourself at https://zenodo.org/record/5602094
Note: If you are running this notebook in Binder, your changes will not be saved. To make significant changes, you should download a local version.
Let's begin!
$\color{red}{\text{TODO}}$: Welcome! Input your file parameters here, then run the notebook.
In particular, here's where you should enter the filename of your audio or IQ file and indicate whether your file needs demodulation before it is processed. You should also provide some information about the station where the data was collected by editing the other variables.
# This is the special cell for papermill.
# File path to the collected signal
input_filename = "N6GN_20211115T190749_iq_15.wav" # type: str
# input_filename = "w2naf.com_2021-11-15T19_07_36Z_10000.00_iq.wav"
# input_filename = "N6GN_20211115T190749_am_15.wav"
# Does the collected signal require AM demodulation?
# If your signal is an IQ file, set this to True. If it is an AM (audio) file, set this to False.
input_requires_demodulation = True # type: bool
# latitude in decimal degrees (+ is North and - is South)
lat = 41.50 # type: float
# longitude in decimal degrees (+ is East and - is West)
lon = -81.61 # type: float
# a short string describing the radio used to make the recording
radio = "ICOM 7600" # type: str
# a short string describing the antenna used to make the recording
antenna = "half-wave dipole" # type: str
# --- Additional parameters ---
# Most people won't need to change these.
# File path to the template signal
fname = "test.wav" # type: str
# Do you want to force a custom sample rate instead of the one stored in the WAV files themselves?
# Leave fs_custom = -1 to use the sample rate from the WAV files.
fs_custom = -1 # type: int
First, we'll make sure the requisite packages are installed.
If you are running this on Binder, the packages will be installed automatically.
If you are not running on Binder, install the Python packages from requirements.txt (i.e., pip install -r requirements.txt).
Next, we'll import the packages we need.
from os.path import splitext
import json
import numpy as np
import matplotlib.pyplot as plt
import scipy.signal
from scipy.io import wavfile
If you're interested, all functions defined for this notebook can be found in the sunrisefest_module.py file.
from sunrisefest_module import *
# Matplotlib settings to make the plots look a little nicer.
%matplotlib inline
plt.rcParams['font.size'] = 18
plt.rcParams['font.weight'] = 'bold'
plt.rcParams['axes.grid'] = True
plt.rcParams['axes.xmargin'] = 0
plt.rcParams['grid.linestyle'] = ':'
plt.rcParams['figure.figsize'] = (10,6)
# create an empty array to hold the results we find
results = []
def record(name: str, description: str, value, unit: str):
'''Record a result of our analysis'''
print(description.format(value))
results.append({
"name": name,
"description": description,
"value": value,
"unit": unit})
def save_results():
'''Save the results to a file'''
output_file = splitext(input_filename)[0] + "-results.json"
print(f"writing to file {output_file}")
with open(output_file, 'w') as file:
json.dump({
"Input filename": input_filename,
"parsed info": parse(input_filename),
"fs_custom": fs_custom,
"modulation": input_requires_demodulation,
"original sample rate": fs_collected,
"upsampled sample rate": fs_upsampled_collected,
"latitude": lat,
"longitude": lon,
"results": results
}, file, indent=2)
Before we load your recording, we'll load the template signal and extract a few key parts of the template. We'll save each of these parts into variables so that we can compare them to your recording later.
# Load in the file and detect the sampling frequency
fs_template, template_signal = wavfile.read(fname)
print('Sample Rate: {!s} samples/sec'.format(fs_template))
# Create a time vector
t_template = np.arange(len(template_signal)) * (1./fs_template)
Sample Rate: 48000 samples/sec
Using the sounddevice library, we can hear what the template sounds like when played as audio:
play(template_signal, fs_template)
Next, we'll normalize the data, rescaling the signal so it is bounded between -1 and 1:
normalized_template = norm(template_signal)
normalized_template
array([-0.00354015, -0.00399792, -0.00436415, ..., 0. , 0. , 0. ])
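The norm helper is defined in sunrisefest_module.py. Assuming it performs peak normalization, as the description above suggests, a minimal sketch might look like this (peak_norm is a hypothetical name for illustration):

```python
import numpy as np

def peak_norm(x):
    """Rescale a signal so its largest absolute value is 1."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    # Guard against an all-zero signal
    return x / peak if peak > 0 else x

# A signal spanning [-4, 2] is rescaled to span [-1, 0.5]
scaled = peak_norm([-4.0, 2.0, 0.0])
```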
Let's generate a time-series plot and a spectrogram to observe the signal visually
plot_signal(t_template, normalized_template, title="Manufactured Signal")
Here we'll extract one of the white Gaussian noise bursts. This extracted signal will be used later to identify the timing of white Gaussian noise in your collected data.
white_noise, t_white_noise = extract(normalized_template, 10, 12, fs_template)
Similarly, we'll also extract chirps from the template signal.
chirps, t_chirps = extract(normalized_template, 24, 32, fs_template)
Now that we have our templates, we will load your recording.
# Note that IQ WAV files look like regular stereo WAV files, but instead of
# the channels representing the left and right speakers, they represent the
# I and Q signals.
fs_collected, iq = wavfile.read(input_filename)
t_collected = np.arange(len(iq))*(1./fs_collected)
print('Sample Rate: {!s} samples/sec'.format(fs_collected))
if input_requires_demodulation:
print('Number of Channels: {!s}'.format(iq.shape[1]))
print('Data Type: {!s}'.format(iq.dtype))
Sample Rate: 20249 samples/sec Number of Channels: 2 Data Type: float32
# Scale the samples to floats strictly inside (-1, 1); the +1.0 in the
# denominator also guards against division by zero
iq_float = iq / (np.max(np.abs(iq)) + 1.0)
# If this signal requires AM demodulation, do that:
if input_requires_demodulation:
collected_signal = np.sqrt(iq_float[:,0]**2 + iq_float[:,1]**2)
else:
collected_signal = iq_float
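The envelope detection above (sqrt(I² + Q²)) recovers the AM modulating signal from the I/Q pair. Here is a small synthetic check of that idea, using made-up tone and offset parameters:

```python
import numpy as np

fs = 48000                                   # sample rate (arbitrary for this demo)
t = np.arange(fs) / fs                       # one second of time
message = 0.5 * np.sin(2*np.pi*2*t) + 1.0    # slow modulating tone, always positive
f_offset = 1000                              # carrier offset within the IQ baseband

# Build an I/Q pair whose magnitude is the message
i = message * np.cos(2*np.pi*f_offset*t)
q = message * np.sin(2*np.pi*f_offset*t)

# Envelope detection recovers the message to floating-point precision
envelope = np.sqrt(i**2 + q**2)
```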
Let's listen to the file we've imported:
play(collected_signal, fs_collected)
Normalize the data
normalized_collected = norm(collected_signal)
plot_signal(t_collected, normalized_collected, title=input_filename)
Removing the DC offset centers the collected signal around 0.
centered_collected = dco(collected_signal)
plot_signal(t_collected, centered_collected, title='DC-Removed '+input_filename)
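The dco helper comes from sunrisefest_module.py. A DC offset is just the mean of the signal, so a plausible sketch of such a helper (remove_dc is a hypothetical name for illustration) is:

```python
import numpy as np

def remove_dc(x):
    """Subtract the mean so the signal is centered on zero."""
    x = np.asarray(x, dtype=float)
    return x - np.mean(x)

# [1, 2, 3] has mean 2, so the centered version is [-1, 0, 1]
centered = remove_dc([1.0, 2.0, 3.0])
```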
The template signal and collected signal may have different sampling rates, which is a problem for cross-correlation: the algorithm would effectively be comparing two signals where one is compressed or stretched in time relative to the other.
Therefore, let us resample the collected signal to make sure that the sampling rates match.
new_number_of_samples = int(len(centered_collected)/fs_collected*fs_template)
print("Total number of samples in the collected signal: ",new_number_of_samples)
Total number of samples in the collected signal: 3545187
fs_upsampled_collected = fs_template
print("New sampling rate of collected signal: {:f} samples per second".format(fs_upsampled_collected))
New sampling rate of collected signal: 48000.000000 samples per second
upsampled_collected, t_upsampled_collected = scipy.signal.resample(centered_collected, new_number_of_samples, t=t_collected)
plot_signal(t_upsampled_collected, upsampled_collected, title='Resampled Received Data')
First, the algorithm will identify the timing of the chirps. This is achieved by cross-correlating the template chirps with your collected signal; the timing of the chirps is the time at which the correlation reaches its maximum value. We can then split the search for the white Gaussian noise bursts into the regions before and after the chirps, following the same procedure we used to identify the chirps.
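find_timing_of is defined in sunrisefest_module.py. Assuming it works as described, by taking the lag of the cross-correlation peak, a self-contained sketch of the idea (timing_of is a hypothetical name for illustration) is:

```python
import numpy as np
import scipy.signal

def timing_of(template, recording, fs):
    """Return the start time (s) of the best match of template within recording."""
    # 'valid' mode slides the template across the recording without zero-padding,
    # so index i corresponds to the template starting at sample i
    r = scipy.signal.correlate(recording, template, mode='valid')
    return np.argmax(r) / fs

# Synthetic check: bury a 1 s chirp 2 s into 5 s of noise
fs = 1000
t = np.arange(fs) / fs
chirp = scipy.signal.chirp(t, f0=10, f1=100, t1=1.0)
rng = np.random.default_rng(0)
recording = 0.1 * rng.standard_normal(5 * fs)
recording[2*fs:3*fs] += chirp
found = timing_of(chirp, recording, fs)  # ≈ 2.0 seconds
```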
Let's cross-correlate the template chirps with our entire recording to find the superimposed chirps:
t_upsampled_chirps = find_timing_of(chirps, upsampled_collected, fs_upsampled_collected)
print("Start Time of the Chirps relative to the beginning of the recording: {:f} seconds".format(t_upsampled_chirps))
Start Time of the Chirps relative to the beginning of the recording: 34.307458 seconds
We know that the first white noise burst happens before the chirps. As described previously, we will restrict our search for the first white Gaussian noise from the beginning of the recording up to the timing of the chirps which was found in the previous step.
collected_extract_for_t1, t_collected_extract_for_t1 = extract(upsampled_collected, 'start', t_upsampled_chirps, fs_upsampled_collected)
collected_extract_for_t1
array([-0.01719864, -0.05533837, -0.03759671, ..., -0.07412368, -0.013604 , 0.0515242 ], dtype=float32)
t1 = find_timing_of(white_noise, collected_extract_for_t1, fs_upsampled_collected)
This variable, t1, is our first finding. We'll save it for later using the record helper function. (At the end of this notebook, we'll write everything saved with record to a file that you can submit along with your raw recording.)
record('t1', 'Start time of the first white Gaussian noise with respect to the beginning of the recording: {:f} seconds', t1, 'seconds')
Start time of the first white Gaussian noise with respect to the beginning of the recording: 20.307375 seconds
It is useful to find the timing of the chirps with respect to the timing of the first white noise burst, so we will save it for later as well.
t_chirps_wrt_t1 = t_upsampled_chirps - t1
record(
't_chirps_wrt_t1',
"Start Time of the Chirps with respect to Start Time of the First Noise: {:f} seconds",
t_chirps_wrt_t1,
'seconds')
Start Time of the Chirps with respect to Start Time of the First Noise: 14.000083 seconds
Now we'll find how long after the first white noise the second noise starts.
Once again, we can restrict our search to after the timing of the chirps found in the first step.
# Restrict the search to the portion of the recording after the chirps begin
collected_extract_for_t2, t_collected_extract_for_t2 = extract(upsampled_collected, t_upsampled_chirps, 'end', fs_upsampled_collected)
t2 = find_timing_of(white_noise, collected_extract_for_t2, fs_upsampled_collected) + t_chirps_wrt_t1
record(
't2',
"Start time of the second white Gaussian noise with respect to the first white Gaussian noise: {:f} seconds.",
t2,
"seconds")
Start time of the second white Gaussian noise with respect to the first white Gaussian noise: 27.000146 seconds.
We can also find the time of the second burst relative to the beginning of the recording by adding the times t1 and t2:
print("Start time of the second white Gaussian noise with respect to the beginning of the recording: {:f} seconds.".format(t2 + t1))
Start time of the second white Gaussian noise with respect to the beginning of the recording: 47.307521 seconds.
Let's investigate the correlation plot between the collected signal and the template chirps, and save the plot to a file.
R_chirps, tau_chirps = crosscorrelate(upsampled_collected, chirps, fs_upsampled_collected)
fig = plot_correlation(tau_chirps, R_chirps, title='Cross-Correlation between Template Chirps and Signal between Noises\nsignal: {}'.format(input_filename), return_figure=True)
fig.savefig(splitext(input_filename)[0] + "-results.png", dpi=300, facecolor='white')
In addition to the five large peaks, you might (or might not) see smaller peaks that are delayed slightly. If you can see them, they are likely evidence that you received the signal along more than one propagation path.
multipath = False
record('multipath', 'user saw multipath? {}', multipath, 'yes/no')
user saw multipath? False
Let's make an interactive version of this plot so we can zoom in and identify the location of the peaks.
To generate an interactive plot, uncomment the following line of code by removing the '#' at the start of the line.
R_peaks, t_peaks = plot_correlation_interactive(tau_chirps, R_chirps, title='Cross-correlation')
Peaks found at [32.107458333333334, 33.207458333333335, 34.30745833333333, 35.40745833333333, 36.50745833333333]
Zoom in to the area with the five peaks. How many different propagation paths can you see?
Set the variable to the number of paths you see. Remember, a group of five large peaks by themselves means you heard only one path. If each large peak is followed by one smaller 'echo', you heard a second path. If there were two 'echoes', you heard three paths, and so on.
number_of_paths = 1
record('number_of_paths', 'The user was able to identify {} paths', number_of_paths, 'count')
We'll next save the peaks that were found in the cross-correlation (the red plus marks).
record('R_peaks', 'Array of amplitudes of peaks: {}', R_peaks, 'unitless (cross-correlation)')
record('t_peaks', 'Array of times of peaks: {}', t_peaks, 'seconds')
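If you did see echo peaks, the multipath delay is the time between a main peak and its echo. Given an array of peak times like t_peaks, one way to estimate it from the spacing between adjacent peaks (illustrated here with made-up peak times for a two-path case):

```python
import numpy as np

# Hypothetical peak times (s): main peaks ~1.1 s apart, each followed
# by an echo 3 ms later, as might appear with two propagation paths
t_peaks_demo = np.array([32.100, 32.103, 33.200, 33.203, 34.300, 34.303])

# Gaps alternate between the echo delay and the main-peak spacing;
# the smallest repeated gap is the candidate multipath delay
gaps = np.diff(t_peaks_demo)
multipath_delay = np.min(gaps)
print(f"estimated multipath delay: {multipath_delay*1000:.1f} ms")
```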
Finally, write the results we've been saving with the record function to a file:
save_results()
That's it! Along with your recording, submit both the .json file with your numerical results and the .png file with your plot of the chirp correlations.
Upload your files here: https://cwru.app.box.com/f/e2744d4039dd4482bb8722a0fb27bd73