Signal processing is a field of engineering and applied mathematics that focuses on analyzing, modifying, and synthesizing signals. A signal can be anything that conveys information, such as sound, images, or sensor readings. The primary goal of signal processing is to extract useful information from signals or transform them to make them more efficient for storage, transmission, or further analysis. This can involve filtering out noise, compressing data, enhancing features, or transforming signals into a different domain (e.g., from time to frequency).
In this problem, you will analyze the trend of Samsung Electronics' stock price over the past 10 years. The following cell downloads the data and stores it in the samsung dataframe.
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from datetime import date, timedelta
samsung = yf.download('005930.KS', start=date.today()-timedelta(days=10*365),
                      auto_adjust=False)  # auto_adjust=False keeps the 'Adj Close' column
y = samsung['Adj Close'].values
plt.figure(figsize=(10,6), dpi=100)
plt.plot(samsung.index, y)
plt.grid(True)
plt.xlabel('Date')
plt.ylabel('Adjusted close price')
plt.title('Samsung Electronics Co., Ltd. (005930.KS)')
plt.show()
A moving average (rolling average or running average) is a calculation to analyze data points by creating a series of averages of different subsets of the full data set. It is also called a moving mean or rolling mean and is a type of finite impulse response filter.
Given a series of numbers ($y_1,\dots,y_N$) and a fixed subset size $n$, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the subset.
We implement a simple moving average $x_t$ of the given signal $y_t$ with window size $n$ as follows:
$$ x_t = \begin{cases} \left( y_t + \cdots + y_{1}\right)/t &\quad \text{if } t\le n,\\ \left( y_t + y_{t-1} + \cdots + y_{t-n+1}\right)/n &\quad \text{otherwise.} \end{cases} $$

Various moving average filters for $n=7, 30, 90, 180$ are presented below. What do you observe?
N = len(y)
window = [7, 30, 90, 180]
k = len(window)
x = np.zeros((k, N))
for i in range(k):
    n = window[i]
    for j in range(N):
        start = max(0, j - n + 1)   # window covers at most the last n samples
        data = y[start:j + 1]
        n_data = len(data)
        x[i, j] = np.sum(data) / n_data
plt.figure(figsize=(10,6), dpi=100)
plt.plot(y, alpha=0.4, label='Samsung Electronics')
for i in range(k):
    plt.plot(x[i,:], label=f'Moving average with n={window[i]}')
plt.grid()
plt.xlabel('Date')
plt.ylabel('Adjusted close price')
plt.title('Samsung Electronics Co., Ltd. (005930.KS)')
plt.legend()
plt.show()
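The double loop above can also be written without the inner loop by using a cumulative sum. The sketch below (using a small synthetic series rather than the downloaded prices, and a helper name of our own choosing) reproduces the same piecewise definition:

```python
import numpy as np

def moving_average(y, n):
    """Simple moving average matching the piecewise definition:
    for t <= n, average the first t samples; afterwards, the last n."""
    c = np.cumsum(np.insert(y.astype(float), 0, 0.0))  # c[t] = y_1 + ... + y_t
    t = np.arange(1, len(y) + 1)
    start = np.maximum(t - n, 0)                        # window start index
    return (c[t] - c[start]) / (t - start)              # window sum / window length

y_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(moving_average(y_demo, 3))  # first two entries are partial-window averages
```

This replaces the O(N·n) inner summation with an O(N) cumulative sum per window size, which matters once the series and windows grow.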
We have gravity measurements taken at scattered locations on the Earth's surface, and we want to estimate the gravity values at other locations where measurements were not taken.
import numpy as np
import matplotlib.pyplot as plt
# Generate some sample gravity data
num_points = 50
np.random.seed(3001)
x = np.random.rand(num_points)
y = np.random.rand(num_points)
gravity_values = np.sin(2*np.pi*x) + np.cos(2*np.pi*y) \
+ np.random.randn(num_points)*0.1
# Plot the results
plt.figure(figsize=(8, 6), dpi=100)
plt.scatter(x, y, c=gravity_values, label='Gravity Measurements')
plt.colorbar(label='Gravity Anomaly')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Gravity Anomaly Interpolation using IDW')
plt.legend()
plt.grid()
plt.show()
We'll use Inverse Distance Weighting (IDW) interpolation for estimating the gravity at the locations where the measurements were not taken.
IDW interpolation is a simple and widely used method for estimating values at unsampled locations based on nearby measurements. It assumes that the value at an unsampled location is a weighted average of the values at nearby measurement locations, where the weights are inversely proportional to the distance from the measurement locations.
Let's denote the IDW estimate at an unsampled location $x$ by

$$ \hat{z}(x) = \frac{\sum_{i=1}^{N} w_i z_i}{\sum_{i=1}^{N} w_i}, $$

where $z_i$ is the measurement taken at location $x_i$ and the weights $w_i$ are calculated as:

$$ w_i = 1/d_i^p $$

The power parameter $p$ typically takes a value of 2, but it can be adjusted to control the influence of distance on the weights. A higher value of $p$ concentrates the weight on the closest measurements, while a lower value of $p$ gives relatively more weight to distant measurements.
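A quick numerical check of this effect, using three hypothetical distances, shows how the normalized weight of the closest measurement grows with $p$:

```python
import numpy as np

d = np.array([0.1, 0.5, 1.0])          # hypothetical distances to three measurements
for p in (1, 2, 4):
    w = 1 / d**p                        # IDW weights
    print(p, np.round(w / w.sum(), 3))  # normalized weights: closest dominates as p grows
```

With $p=1$ the nearest point gets roughly three quarters of the total weight; by $p=4$ it receives nearly all of it.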
The distance $d_i$ between the unsampled location $x$ and the measurement location $x_i$ can be calculated using various distance metrics, such as the Euclidean distance or the Manhattan distance. The choice of distance metric depends on the specific application and the nature of the data.
The following shows the results obtained using the Euclidean distance with $p=2$.
# Define the interpolation grid
grid_x, grid_y = np.mgrid[0:1:100j, 0:1:100j]

# Perform IDW interpolation
interpolated_gravity = np.zeros_like(grid_x)
for i in range(grid_x.shape[0]):
    for j in range(grid_x.shape[1]):
        distances = np.sqrt((grid_x[i, j] - x)**2 + (grid_y[i, j] - y)**2)
        # Inverse distance squared weighting (p = 2); the small epsilon guards
        # against division by zero if a grid point coincides with a measurement
        weights = 1 / (distances**2 + 1e-12)
        interpolated_gravity[i, j] = np.sum(gravity_values * weights) / np.sum(weights)
# Plot the results
plt.figure(figsize=(8, 6), dpi=100)
plt.scatter(x, y, c=gravity_values, label='Gravity Measurements')
plt.contourf(grid_x, grid_y, interpolated_gravity, alpha=0.7)
plt.colorbar(label='Gravity Anomaly')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Gravity Anomaly Interpolation using IDW')
plt.legend()
plt.grid()
plt.show()
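For larger grids, the nested loops can be replaced with a single broadcasted computation. The helper below is a sketch (the function name, its signature, and the epsilon guard against zero distances are our own additions):

```python
import numpy as np

def idw(xq, yq, xs, ys, values, p=2, eps=1e-12):
    """IDW estimate at query points (xq, yq) from samples (xs, ys, values).

    Broadcasting forms the full (query x sample) distance matrix at once;
    eps guards against division by zero when a query hits a sample exactly.
    """
    d = np.hypot(xq[:, None] - xs[None, :], yq[:, None] - ys[None, :])
    w = 1.0 / (d**p + eps)                  # inverse-distance weights
    return (w @ values) / w.sum(axis=1)     # weighted average per query point

# Tiny example: one sample at each corner of the unit square
xs = np.array([0.0, 1.0, 0.0, 1.0])
ys = np.array([0.0, 0.0, 1.0, 1.0])
vals = np.array([0.0, 1.0, 1.0, 2.0])
print(idw(np.array([0.5]), np.array([0.5]), xs, ys, vals))  # center is equidistant -> mean 1.0
```

The same call evaluates the whole 100×100 grid in one shot via `idw(grid_x.ravel(), grid_y.ravel(), x, y, gravity_values)`, at the cost of a distance matrix held in memory.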