Signal processing is a field of engineering and applied mathematics that focuses on analyzing, modifying, and synthesizing signals. A signal can be anything that conveys information, such as sound, images, or sensor readings. The primary goal of signal processing is to extract useful information from signals or transform them to make them more efficient for storage, transmission, or further analysis. This can involve filtering out noise, compressing data, enhancing features, or transforming signals into a different domain (e.g., from time to frequency).
In this problem, you will analyze the trend of Samsung Electronics' stock price over the past 10 years. The following cell downloads the data and stores it in the samsung dataframe.
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from datetime import date, timedelta
samsung = yf.download('005930.KS', start=date.today()-timedelta(days=10*365),
                      auto_adjust=False)  # auto_adjust=False keeps the 'Adj Close' column
y = samsung['Adj Close'].values
plt.figure(figsize=(10,6), dpi=100)
plt.plot(samsung.index, y)
plt.grid(True)
plt.xlabel('Date')
plt.ylabel('Adjusted close price')
plt.title('Samsung Electronics Co., Ltd. (005930.KS)')
plt.show()
A moving average (rolling average or running average) is a calculation to analyze data points by creating a series of averages of different subsets of the full data set. It is also called a moving mean or rolling mean and is a type of finite impulse response filter.
Given a series of numbers ($y_1,\dots,y_N$) and a fixed subset size $n$, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the subset.
We implement a simple moving average $x_t$ of the given signal $y_t$ with window size $n$ as follows:
$$ x_t = \begin{cases} \left( y_t + \cdots + y_{1}\right)/t &\quad \text{if } t\le n,\\ \left( y_t + y_{t-1} + \cdots + y_{t-n+1}\right)/n &\quad \text{otherwise.} \end{cases} $$

Various moving average filters for $n=7, 30, 90, 180$ are presented below. What do you observe?
N = len(y)
window = [7, 30, 90, 180]
k = len(window)
x = np.zeros((k, N))
for i in range(k):
    n = window[i]
    for j in range(N):
        start = max(0, j - n + 1)   # window covers at most the last n samples
        data = y[start:j + 1]
        n_data = len(data)
        x[i, j] = np.sum(data) / n_data
plt.figure(figsize=(10,6), dpi=100)
plt.plot(y, alpha=0.4, label='Samsung Electronics')
for i in range(k):
    plt.plot(x[i,:], label=f'Moving average with n={window[i]}')
plt.grid()
plt.xlabel('Date')
plt.ylabel('Adjusted close price')
plt.title('Samsung Electronics Co., Ltd. (005930.KS)')
plt.legend()
plt.show()
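The double loop above can also be written without the inner loop by using a cumulative sum. The sketch below (using a small synthetic series rather than the downloaded prices, and a helper name of our own choosing) reproduces the same piecewise definition:

```python
import numpy as np

def moving_average(y, n):
    """Simple moving average matching the piecewise definition:
    for t <= n, average the first t samples; afterwards, the last n."""
    c = np.cumsum(np.insert(y.astype(float), 0, 0.0))  # c[t] = y_1 + ... + y_t
    t = np.arange(1, len(y) + 1)
    start = np.maximum(t - n, 0)                        # window start index
    return (c[t] - c[start]) / (t - start)              # window sum / window length

y_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(moving_average(y_demo, 3))  # first two entries are partial-window averages
```

This replaces the O(N·n) inner summation with an O(N) cumulative sum per window size, which matters once the series and windows grow.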
We have gravity measurements taken at scattered locations on the Earth's surface, and we want to estimate the gravity values at other locations where measurements were not taken.
import numpy as np
import matplotlib.pyplot as plt
# Generate some sample gravity data
num_points = 50
np.random.seed(3001)
x = np.random.rand(num_points)
y = np.random.rand(num_points)
gravity_values = np.sin(2*np.pi*x) + np.cos(2*np.pi*y) \
+ np.random.randn(num_points)*0.1
# Plot the results
plt.figure(figsize=(8, 6), dpi=100)
plt.scatter(x, y, c=gravity_values, label='Gravity Measurements')
plt.colorbar(label='Gravity Anomaly')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Gravity Anomaly Interpolation using IDW')
plt.legend()
plt.grid()
plt.show()
We'll use Inverse Distance Weighting (IDW) interpolation for estimating the gravity at the locations where the measurements were not taken.
IDW interpolation is a simple and widely used method for estimating values at unsampled locations based on nearby measurements. It assumes that the value at an unsampled location is a weighted average of the values at nearby measurement locations, where the weights are inversely proportional to the distance from the measurement locations.
Let's denote the IDW estimate at an unsampled location $x$ by

$$ \hat{z}(x) = \frac{\sum_{i=1}^{N} w_i z_i}{\sum_{i=1}^{N} w_i}, $$

where $z_i$ is the measurement taken at location $x_i$ and the weights $w_i$ are calculated as:

$$ w_i = 1/d_i^p $$

The power parameter $p$ typically takes a value of 2, but it can be adjusted to control the influence of distance on the weights. A higher value of $p$ concentrates the weight on the closest measurements, while a lower value of $p$ gives relatively more weight to distant measurements.
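A quick numerical check of this effect, using three hypothetical distances, shows how the normalized weight of the closest measurement grows with $p$:

```python
import numpy as np

d = np.array([0.1, 0.5, 1.0])          # hypothetical distances to three measurements
for p in (1, 2, 4):
    w = 1 / d**p                        # IDW weights
    print(p, np.round(w / w.sum(), 3))  # normalized weights: closest dominates as p grows
```

With $p=1$ the nearest point gets roughly three quarters of the total weight; by $p=4$ it receives nearly all of it.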
The distance $d_i$ between the unsampled location $x$ and the measurement location $x_i$ can be calculated using various distance metrics, such as the Euclidean distance or the Manhattan distance. The choice of distance metric depends on the specific application and the nature of the data.
The following shows the results obtained using the Euclidean distance with $p=2$.
# Define the interpolation grid
grid_x, grid_y = np.mgrid[0:1:100j, 0:1:100j]

# Perform IDW interpolation
interpolated_gravity = np.zeros_like(grid_x)
for i in range(grid_x.shape[0]):
    for j in range(grid_x.shape[1]):
        distances = np.sqrt((grid_x[i, j] - x)**2 + (grid_y[i, j] - y)**2)
        # Inverse distance squared weighting (p = 2); the small epsilon guards
        # against division by zero if a grid point coincides with a measurement
        weights = 1 / (distances**2 + 1e-12)
        interpolated_gravity[i, j] = np.sum(gravity_values * weights) / np.sum(weights)
# Plot the results
plt.figure(figsize=(8, 6), dpi=100)
plt.scatter(x, y, c=gravity_values, label='Gravity Measurements')
plt.contourf(grid_x, grid_y, interpolated_gravity, alpha=0.7)
plt.colorbar(label='Gravity Anomaly')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Gravity Anomaly Interpolation using IDW')
plt.legend()
plt.grid()
plt.show()
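For larger grids, the nested loops can be replaced with a single broadcasted computation. The helper below is a sketch (the function name, its signature, and the epsilon guard against zero distances are our own additions):

```python
import numpy as np

def idw(xq, yq, xs, ys, values, p=2, eps=1e-12):
    """IDW estimate at query points (xq, yq) from samples (xs, ys, values).

    Broadcasting forms the full (query x sample) distance matrix at once;
    eps guards against division by zero when a query hits a sample exactly.
    """
    d = np.hypot(xq[:, None] - xs[None, :], yq[:, None] - ys[None, :])
    w = 1.0 / (d**p + eps)                  # inverse-distance weights
    return (w @ values) / w.sum(axis=1)     # weighted average per query point

# Tiny example: one sample at each corner of the unit square
xs = np.array([0.0, 1.0, 0.0, 1.0])
ys = np.array([0.0, 0.0, 1.0, 1.0])
vals = np.array([0.0, 1.0, 1.0, 2.0])
print(idw(np.array([0.5]), np.array([0.5]), xs, ys, vals))  # center is equidistant -> mean 1.0
```

The same call evaluates the whole 100×100 grid in one shot via `idw(grid_x.ravel(), grid_y.ravel(), x, y, gravity_values)`, at the cost of a distance matrix held in memory.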