#!/usr/bin/env python
# coding: utf-8
# # Unit root test and Hurst exponent
#
#
# ## Overview
#
# - We will first recap and provide more details about some concepts we saw in the previous lectures.
# - Then, we will introduce new analytical tools and some of their applications in finance.
# - The content of the lecture is organized as follows:
# 1. Small recap.
# 2. Unit root test.
# 3. Mean reversion.
# 4. Hurst Exponent.
# 5. Geometric Brownian motion.
# In[2]:
# Imports
import warnings
warnings.simplefilter(action="ignore", category=FutureWarning)
warnings.simplefilter(action="ignore", category=DeprecationWarning)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller
import yfinance as yf
np.random.seed(0) # For reproducible results
# ---
# ## Small recap
#
# - We've covered lots of material in the previous lessons.
# - Now is a good time to step back and review what we've covered.
# ### Decomposition
#
# - Time series data can be decomposed into trend, seasonal, and random fluctuation components.
#
# **Trends**
# - Increasing.
# - Decreasing.
# - Flat.
# - Larger trends can be made up of smaller trends.
# - There is no defined timeframe for what constitutes a trend: it depends on your data and task at hand.
# **Seasonal effects**
# - Weekend retail sales spikes.
# - Holiday shopping.
# - Energy requirement changes with annual weather patterns.
# **Random Fluctuations**
# - The human element.
# - Aggregations of small influencers.
# - Observation errors.
# - The smaller this is in relation to trend and seasonal components, the better we can predict the future.
# ### Additive vs Multiplicative
#
# - Two simple categories of time series models:
#
# **Additive**
# - Data = Trend + Seasonal + Random.
# - If our seasonality and fluctuations are stable, we likely have an additive model.
# **Multiplicative**
# - Data = Trend $\times$ Seasonal $\times$ Random.
# - As easy to fit as Additive if we take the log: ``log(Data) = log(Trend) + log(Seasonal) + log(Random)``.
# - Use multiplicative models if:
# - the amplitude in seasonal and random fluctuations grows with the trend,
# - the percentage change of our data is more important than the absolute value change (e.g. stocks, commodities).
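#
# As a quick illustration, ``statsmodels`` can fit both flavors of decomposition. Below is a minimal sketch on a synthetic monthly series (the series itself and ``period=12`` are assumptions made for the example):
# In[ ]:
# Sketch: decompose a synthetic monthly series both additively and
# multiplicatively. The synthetic data is an arbitrary choice.
from statsmodels.tsa.seasonal import seasonal_decompose
months = pd.date_range('2015-01-01', periods=96, freq='MS')
trend_c = np.linspace(10, 30, 96)
seasonal_c = 1 + 0.2 * np.sin(2 * np.pi * np.arange(96) / 12)
noise_c = 1 + 0.05 * np.random.randn(96)
synthetic = pd.Series(trend_c * seasonal_c * noise_c, index=months)
seasonal_decompose(synthetic, model='additive', period=12).plot()
seasonal_decompose(synthetic, model='multiplicative', period=12).plot();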
# ### Stationarity
#
# A time series is stationary if:
#
# - The mean of the series is constant.
# - The variance does not change over time (homoscedasticity).
# - The covariance between observations depends only on their lag, not on time itself.
#
#
# ### Detect non-stationarity
#
# - Make a run-sequence plot.
# - Rolling statistics:
# - Compute and plot rolling statistics such as moving average/variance on chunks of data.
# - Check if the statistics change over time.
# - This technique can be applied with different window sizes (small windows are noisy, large windows too conservative); a sketch follows this list.
# - Augmented Dickey-Fuller (ADF) test:
# - Statistical tests for checking stationarity.
# - The null hypothesis $H_0$ is that the time series is non-stationary.
# - If the test statistic is small enough and the $p$-value is below the target $\alpha$, we _can_ reject $H_0$ and say that the series is stationary.
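#
# A minimal sketch of the rolling-statistics check on a random walk (the window size of 20 is an arbitrary choice):
# In[ ]:
# Sketch: the rolling mean and std of a random walk drift over time,
# hinting at non-stationarity. Window size 20 is an arbitrary choice.
rw = pd.Series(np.random.normal(0, 1, 500).cumsum())
rw.plot(alpha=0.4, label='Series')
rw.rolling(window=20).mean().plot(label='Rolling mean')
rw.rolling(window=20).std().plot(label='Rolling std')
plt.legend();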
# ### Achieve stationarity
#
# - Take the log of the data.
# - Difference (multiple times if needed) to remove trends and seasonality.
# - Subtract estimated trend and seasonal components.
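#
# A minimal sketch of the first two steps on a synthetic trending series (the series itself is an assumption made for the example):
# In[ ]:
# Sketch: log-transform and difference a trending series, then verify
# the result with the ADF test. The synthetic series is an assumption.
trending = pd.Series(np.exp(0.01 * np.arange(300) + np.random.normal(0, 0.05, 300).cumsum()))
stationarized = np.log(trending).diff().dropna()
print(f'ADF p-value before: {adfuller(trending)[1]:.3f}')
print(f'ADF p-value after log + diff: {adfuller(stationarized)[1]:.3f}')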
# ## Unit root test
# - The ADF is one of the most popular unit root tests.
# - The presence of a unit root suggests that the time series is generated by a stochastic process with some level of persistence.
# - This means that shocks to the system will have permanent effects.
# - This is opposed to stationary processes where shocks have only temporary effects.
#
# > Where does the term *unit root* come from?
# - Consider a simple autoregressive process of order 1, denoted as AR(1).
# - This is represented as:
#
# $$Y(t) = \phi Y(t-1) + \epsilon_t,$$
#
# - where
# - $Y(t)$ is the value of the series at time $t$,
# - $\phi$ is a coefficient,
# - $\epsilon_t$ is a white noise error term.
# - To analyze the properties of the AR(1), we can rewrite its equation in terms of the **lag operator** $L$, where $LY(t) = Y(t-1)$.
# - The equation becomes:
#
# $$(1 - \phi L)Y(t) = \epsilon_t$$
#
# - The polynomial $(1 - \phi L)$ is the **characteristic polynomial** of the AR(1) process.
# - Its roots are found by setting $1 - \phi L = 0$ (the **characteristic equation**) and solving for $L$.
# - The solution gives $L = 1/\phi$, which is the **root of the characteristic equation**.
# - If $\phi = 1$, then $L = 1/\phi = 1$, meaning that the root of the characteristic equation has a "unit" value.
# - This is what is known as a "unit root".
# - When an AR(1) process has a unit root (i.e., $\phi = 1$), it implies that the series is non-stationary as it becomes a random walk: $Y(t) = Y(t-1) + \epsilon_t$
# - The value of the series will be highly dependent on its previous values (the shocks to the series have a permanent effect).
#
# > In conclusion, the term "unit root" describes a specific condition in the autoregressive representation of a time series, indicating that the series is **non-stationary**.
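#
# As a quick numerical illustration (a sketch; the values of $\phi$ and the series length are arbitrary choices), we can simulate an AR(1) with $\phi < 1$ and one with $\phi = 1$ and run the ADF test on both:
# In[ ]:
# Sketch: AR(1) with phi = 0.5 (stationary) vs phi = 1.0 (unit root).
# The phi values and the series length are arbitrary choices.
def simulate_ar1(phi, n=500):
    y = np.zeros(n)
    eps = np.random.normal(0, 1, n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y
for phi in (0.5, 1.0):
    p_val = adfuller(simulate_ar1(phi))[1]
    print(f'phi = {phi}: ADF p-value = {p_val:.3f}')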
# ### More on the ADF test
#
# **Formulation of the test**
# - The basic idea of the ADF test is to assess whether lagged values of the time series are useful in predicting current values.
# - The test starts with a model that includes the time series lagged by one period (lag-1).
# - Then, other lagged terms are added to control for higher-order correlation (this is the "augmented" part of the ADF test).
# **ADF test equation**
#
# The ADF test models the time series with the following equation:
#
# $$\Delta Y(t) = \alpha + \beta t + \gamma Y(t-1) + \sum_{i=1}^{p} \delta_i \Delta Y(t-i) + \epsilon_t$$
#
# - where
# - $\Delta Y(t) = Y(t) - Y(t-1)$ is the difference of the series at time $t$,
# - $\alpha$ is a constant,
# - $\beta$ is the coefficient of a deterministic trend,
# - $\gamma$ is the coefficient on the lagged value of the series,
# - $\delta_i$ are the coefficients of the lagged differences (they account for higher-order correlations),
# - $\epsilon_t$ is the error term.
# **ADF test for AR(1)**
#
# - Where does the model used in the ADF test come from?
# - Consider the simple case of an AR(1) model:
#
# $$
# Y(t) = \phi Y(t-1) + \epsilon_t
# $$
#
# - Subtract $Y(t-1)$ from both sides:
#
# \begin{align*}
# Y(t) - Y(t-1) & = \phi Y(t-1) - Y(t-1) + \epsilon_t \\
# \Delta Y(t) &= (\phi - 1) Y(t-1) + \epsilon_t
# \end{align*}
# This is exactly the ADF model above when:
# - $p=0$ (an AR(1) needs no lagged difference terms),
# - $\alpha=\beta=0$ (zero mean and no trend),
# - $\gamma = \phi - 1$.
# **Connection to unit root**
#
# - If $\gamma = 0$, then $\phi = 1$ (unit root).
# - This means that
#
# $$\Delta Y(t) = \epsilon_t \implies Y(t) = Y(t-1) + \epsilon_t,$$
#
# - which is the non-stationary random walk.
# **Null and alternative hypotheses (in terms of $\gamma$)**
# - $H_0$: $\gamma = 0$, the time series has a unit root, i.e., it is not stationary.
# - $H_1$: $\gamma < 0$, the series does not have a unit root, i.e., it is stationary.
# **Test statistics**:
# - The ADF test statistic is the $t$-statistic of the coefficient $\hat{\gamma}$ estimated from data (i.e., $\hat{\gamma}$ divided by its standard error).
# - The statistic is then compared to the critical values of the Dickey-Fuller distribution.
# - If the test statistic is more negative than the critical value, $H_0$ is rejected.
# - If the test statistic is less negative than the critical value, $H_0$ cannot be rejected.
# - For example, at a significance level $\alpha=0.05$, the critical value is $\gamma_\alpha=-2.89$ (for the version of the test with a constant term).
# - A test statistic more negative than $\gamma_\alpha$ rejects $H_0$; a less negative one does not.
# **Choosing lag length**:
# - The number of lags ($p$) included in the test equation is important.
# - Too few lags might leave out necessary corrections for autocorrelation.
# - Too many lags can reduce the power of the test.
# - The appropriate lag length is often chosen based on information criteria such as the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC).
# In[3]:
# Generating a random walk time series
data = pd.Series(100 + np.random.normal(0, 1, 100).cumsum())
# Perform Augmented Dickey-Fuller test. The lag can be set manually
# with 'maxlag' or inferred automatically with autolag
result = adfuller(data, autolag='AIC') # You can change to 'BIC'
adf_statistic, p_value, usedlag, nobs, critical_values, icbest = result
print(f'ADF Statistic: {adf_statistic :.2f}')
print(f'p-value: {p_value :.2f}')
print(f'Used Lag: {usedlag}')
print(f'Number of Observations: {nobs}')
print(f"Critical Values: {[f'{k}: {r:.2f}' for r,k in zip(critical_values.values(), critical_values.keys())]}\n")
# **Types of ADF tests**
# - There are three versions of the ADF test depending on whether the equation includes the constant term $\alpha$ and the trend $\beta t$:
# - No constant or trend (`'n'`).
# - Constant, but no trend (`'c'`).
# - Both constant and trend (`'ct'`).
# - The choice among these depends on the nature of the underlying time series and on what we want to test.
# In[4]:
# Function to perform ADF test
def perform_adf_test(series, title, regression_type):
out = adfuller(series, regression=regression_type)
print(f"Results for {title}:")
print(f'ADF Statistic: {out[0]:.2f}')
print(f'p-value: {out[1]:.3f}')
print(f"Critical Values: {[f'{k}: {r:.2f}' for r,k in zip(out[4].values(), out[4].keys())]}\n")
# 1. No Constant, no Trend
series_no_const_no_trend = pd.Series(np.random.normal(0, 1, 200))
# 2. Constant, but no Trend
series_const_no_trend = pd.Series(50 + np.random.normal(0, 1, 200))
# 3. Both Constant and Trend
series_const_trend = pd.Series(50 + np.linspace(0, 20, 200) + np.random.normal(0, 1, 200))
# In[5]:
plt.figure(figsize=(12, 4))
plt.subplot(1, 3, 1)
series_no_const_no_trend.plot(title='No Constant, No Trend')
plt.subplot(1, 3, 2)
series_const_no_trend.plot(title='Constant, but No Trend')
plt.subplot(1, 3, 3)
series_const_trend.plot(title='Both Constant and Trend')
plt.tight_layout();
# In[6]:
# 1. No Constant or Trend
perform_adf_test(series_no_const_no_trend, "No Constant, No Trend", 'n')
# 2. Constant, but No Trend
perform_adf_test(series_const_no_trend, "Constant, No Trend", 'c')
# 3. Both Constant and Trend
perform_adf_test(series_const_trend, "Constant and Trend", 'ct')
# - In the last two cases, without accounting for the non-zero mean and the trend we would not reject $H_0$.
# - Try using ``'n'`` in the last two examples and see for yourself.
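#
# A minimal sketch of that check, reusing the series defined above:
# In[ ]:
# Sketch: with regression='n' neither the constant nor the trend is
# modelled, so the test mistakes them for non-stationarity.
perform_adf_test(series_const_no_trend, "Constant, No Trend (regression='n')", 'n')
perform_adf_test(series_const_trend, "Constant and Trend (regression='n')", 'n')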
#
# ---
# ## Mean reversion test
# - Mean reversion refers to the property of a time series to revert to its historical mean.
# - This concept is particularly popular in financial economics, where it is often assumed that asset prices revert to their historical average over the long term.
# - There is, however, no single formal mathematical definition of mean reversion.
# ### Applications in finance
#
# 1. **Portfolio Management**
# - Investors use mean reversion as a strategy to buy assets that have underperformed and sell assets that have overperformed, expecting that they will revert to their historical mean.
# 2. **Risk Management**
# - Understanding mean reversion helps in assessing the long-term risk of assets.
# - If an asset is highly mean-reverting, it might be considered less risky over the long term, as it tends to move back to its average.
# 3. **Economic Forecasting**
# - Economic variables (like GDP growth rates, interest rates) often exhibit mean-reverting behavior.
# - This assumption is used in macroeconomic models and forecasts.
# ### Testing for mean reversion
# - A mean reversion test determines whether, after a deviation from its mean, a time series will eventually revert to that mean.
# - This can be done using unit root tests such as ADF.
# - If a time series has a unit root, it implies that it does not revert to a mean.
# - The absence of a unit root can be indicative of mean reversion.
# **Example: Google stocks**
#
# We will download the Google Open-High-Low-Close-Volume (GOOG OHLCV) data from 1 September 2004 to 31 August 2020 from [Yahoo finance](https://finance.yahoo.com/) using the ``yfinance`` Python package.
# In[7]:
def get_data(tickerSymbol, period, start, end):
# Get data on the ticker
tickerData = yf.Ticker(tickerSymbol)
# Get the historical prices for this ticker
tickerDf = tickerData.history(period=period, start=start, end=end)
return tickerDf
data = get_data('GOOG', period='1d', start='2004-09-01', end='2020-08-31')
# In[8]:
# Plotting the Closing Prices
plt.figure(figsize=(14, 5))
plt.plot(data['Close'], label='GOOG Closing Price')
plt.title('Google Stock Closing Prices (2004-2020)')
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.legend();
# In[9]:
# Perform the ADF test
perform_adf_test(data['Close'],"Google Stock Closing Prices", 'ct')
# - We cannot reject $H_0$.
# - It does not look like a mean reverting time series.
# ### Mean reversion vs. stationarity
#
# Testing for mean reversion and testing for stationarity are related but distinct concepts in time series analysis.
#
# **Key Differences**:
# - Mean reversion testing is focused on whether a time series will return to a specific level (the mean).
# - Stationarity testing checks if the overall statistical properties of the series remain consistent over time.
# **Key Differences (cont.)**:
# - A stationary time series may or may not be mean reverting.
# - A stationary series with a constant mean and variance over time might still not revert to its mean after a shock.
# - Conversely, a mean-reverting series must have some stationarity, particularly in its mean, but other properties (e.g., variance) might change over time.
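#
# A minimal sketch of a mean-reverting series: an AR(1) fluctuating around a fixed level (the level, $\phi = 0.8$, and the length are arbitrary choices):
# In[ ]:
# Sketch: an AR(1) reverting to the level 10; the ADF test rejects the
# unit root. The level, phi, and length are arbitrary choices.
n_obs = 500
mr = np.full(n_obs, 10.0)
for t in range(1, n_obs):
    mr[t] = 10 + 0.8 * (mr[t - 1] - 10) + np.random.normal()
print(f"ADF p-value: {adfuller(mr, regression='c')[1]:.3f}")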
# ## Hurst exponent
# - The Hurst exponent ($H$) is a measure used to characterize the long-term memory of time series.
# - It helps to determine the presence of autocorrelation or persistence in the data.
# - The goal of the Hurst exponent is to provide us with a scalar value that will help us to identify whether a series is
# - random walking,
# - trending,
# - mean reverting.
# - The key insight is that, if any *autocorrelation* exists, then
#
# $$\text{Var}(X(t + \tau) - X(t)) \propto \tau^{2H}$$
#
# - where $H$ is the Hurst exponent.
# A time series can then be characterised in the following manner:
#
# - If $H = 0.5$, the time series is similar to a random walk. In this case, the variance increases *linearly* with $\tau$.
# - If $H < 0.5$, the time series exhibits anti-persistence, i.e. mean reversion. The variance increases *less than linearly* with $\tau$.
# - If $H > 0.5$, the time series exhibits persistent long-range dependence, i.e. is trending. The variance increases *more than linearly* with $\tau$.
# In[10]:
def hurst(ts):
    # Create the range of lag values
    lags = range(2, 100)
    # Calculate the variance of the lagged differences for each lag
    variances = [np.var(np.subtract(ts[lag:], ts[:-lag])) for lag in lags]
    # Fit a line to log(variance) vs log(lag); the slope estimates 2H
    poly = np.polyfit(np.log(lags), np.log(variances), 1)
    # Return the Hurst exponent (half the slope)
    return poly[0] / 2.0
# **What do persistent and anti-persistent time series look like?**
#
# - Let's write a Python function (adapted from [here](https://github.com/Mottl/hurst/tree/master)) to generate random walks with persistence.
# - We incorporate a form of "memory" by considering past values within a specified look-back window to decide the direction of the next step.
# Let's define
#
# ```python
# def random_walk_memory(length, proba, min_lookback, max_lookback)
# ```
#
# where:
# - ``proba`` is the probability that the next increment will follow the trend:
# - ``proba > 0.5``: persistent random walk,
# - ``proba < 0.5``: anti-persistent random walk.
#
# - ``min_lookback`` and ``max_lookback`` are the minimum and maximum window sizes used to determine the trend direction.
# In[11]:
def random_walk_memory(length, proba=0.5, min_lookback=1, max_lookback=100):
series = [0.] * length
for i in range(1, length):
# If the series has not yet reached the min_lookback threshold
# the direction of the step is random (-1 or 1)
if i < min_lookback + 1:
direction = np.sign(np.random.randn())
# consider the historical values to determine the direction
else:
# randomly choose between min_lookback and the minimum of
# i-1 (to ensure not exceeding the current length) and max_lookback.
lookback = np.random.randint(min_lookback, min(i-1, max_lookback)+1)
# Decides whether to follow the recent trend or move against it,
# based on a comparison between proba and a random number between 0 and 1.
recent_trend = np.sign(series[i-1] - series[i-1-lookback])
change = np.sign(proba - np.random.uniform())
direction = recent_trend * change
series[i] = series[i-1] + np.fabs(np.random.randn()) * direction
return series
# In[24]:
bm = random_walk_memory(2000, proba=0.5)
persistent = random_walk_memory(2000, proba=0.7)
antipersistent = random_walk_memory(2000, proba=0.3)
_, axes = plt.subplots(1,3, figsize=(24, 4))
axes[0].plot(bm)
axes[0].set_title(f"Random Walk, H: {hurst(bm):.2f}")
axes[1].plot(persistent)
axes[1].set_title(f"Persistent, H: {hurst(persistent):.2f}")
axes[2].plot(antipersistent)
axes[2].set_title(f"Anti-Persistent, H: {hurst(antipersistent):.2f}");
# - What is the Hurst exponent of the Google stocks closing price?
# In[13]:
print(f"GOOG closing price, H: {hurst(data['Close'].values):.2f}")
# - The estimated $H$ for the Google closing prices is close to $0.5$, indicating that the series is close to a geometric random walk (at least for the sample period we're looking at).
# - What does this mean in practice?
# ### Interpretation of the Hurst exponent in finance:
#
# - The Hurst exponent, $H$, is a critical metric in the analysis of financial time series.
# - Offers insights into the behavior of assets such as stocks.
# - Here's how to interpret $H$ in the context of *closing stock prices* and its influence on *investment decisions*.
# **Case 1: $H = 0.5$**
#
# - This value suggests that the stock price follows a random walk.
# - It implies that future price movements are independent of past movements.
# - For investors, this means that there are no autocorrelations in price movements to exploit; past data cannot predict future prices.
# - Trading strategies based on historical price patterns or trends are, theoretically, no more likely to be successful than random guessing.
# - Investors should rely on broader market analysis or diversified portfolio strategies rather than trying to predict future prices based on past trends.
# **Case 2: $H < 0.5$**
#
# - Indicates a mean-reverting series, i.e., the stock price tends to revert to its historical average.
# - This suggests that the asset is less risky over the long term.
# - Investors might interpret a low $H$ as an opportunity to buy stocks after a significant drop, expecting a reversion to the mean, or to sell after a substantial rise.
# **Case 3: $H > 0.5$**
#
# - Suggests a trending series, where increases or decreases in stock prices are likely to be followed by further increases or decreases, respectively.
# - This persistence indicates potential momentum in stock prices, which can be exploited by momentum strategies:
# - Buying stocks that have been going up in the hope that they will continue to do so, and selling those in a downtrend.
#
# #### Practical considerations:
#
# Before applying these interpretations, investors should also make the following considerations:
# - The observed value of $H$ can vary over different time frames: analyze $H$ over the period relevant to the investment horizon.
# - External factors such as market conditions, economic indicators, and geopolitical events can influence stock prices and should be considered alongside $H$.
# - Regardless of the value of $H$, implementing proper risk management strategies is essential to protect against unexpected market movements.
# - If the time series is too short, the value of $H$ might not be reliable.
#
# ---
# ## Brownian Motion
# - Recall the random walk
#
# $$X(t) = X(t-1) + \epsilon_t$$
#
# - where $\epsilon_t \sim \mathcal{N}(0,\sigma^2)$.
# - This can also be expressed as:
#
# $$X(t) = X(0) + W(t)$$
#
# - where $W(t)$ is called *Wiener Process*, or as:
#
# $$X(t) = X(0) + B(t)$$
#
# - where $B(t)$ is called *Brownian Motion*.
# - $W(t)$ and $B(t)$ represent the cumulative sum of normally distributed increments $\epsilon_1 + \epsilon_2 + \dots + \epsilon_t$.
# **Variance of Brownian Motion at time lag $\tau$**
#
# - For a time series $X(t)$, the variance at a time lag $\tau$ can be defined as:
#
# $$\text{Var}(X(t + \tau) - X(t))$$
# - Given a Brownian motion $B(t)$ with unit-variance increments, the increment $B(t + \tau) - B(t)$ is normally distributed with mean 0 and variance $\tau$.
# - This is because increments over disjoint steps are independent, so their variances add up over a period of length $\tau$:
#
# $$\text{Var}(B(t + \tau) - B(t)) = \tau$$
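#
# A quick empirical check of this relation (a sketch; the number of paths, steps, and lags are arbitrary choices):
# In[ ]:
# Sketch: the empirical variance of B(t + tau) - B(t) grows linearly with
# tau (unit-variance increments assumed). All sizes are arbitrary choices.
n_steps, n_sim_paths = 500, 1000
B = np.random.normal(0, 1, (n_sim_paths, n_steps)).cumsum(axis=1)
for tau in (10, 50, 100):
    var_tau = np.var(B[:, tau:] - B[:, :-tau])
    print(f'tau = {tau:3d}: empirical variance = {var_tau:.1f}')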
#
# ---
# ## Geometric Brownian Motion
# - The *Geometric Brownian Motion* (GBM) is a stochastic process.
# - It is often used to model stock prices and other financial variables that do not revert to a mean but rather exhibit trends with a drift $\mu$ and volatility $\sigma$.
# The GBM is defined as:
#
# $$S(t) = S_0 \exp\left((\mu - \frac{1}{2} \sigma^2)t + \sigma W(t)\right)$$
#
# - where
# - $S(t)$ is the stock price at time $t$.
# - $S_0$ is the initial stock price at time $t = 0$.
# - $\mu$ is the expected annual return (drift coefficient).
# - $\sigma$ is the volatility (standard deviation of returns).
# - $W(t)$ is a Wiener process (standard Brownian motion).
# In[14]:
S0 = 100 # Initial stock price
mu = 0.09 # Expected annual return (9%)
sigma = 0.25 # Annual volatility (25%)
T = 2 # Time horizon in years
dt = 1/252 # Time step in years, assuming 252 trading days per year
N = int(T/dt) # Number of time steps
t = np.linspace(0, T, N) # Time vector
# Brownian Motion
dW = np.random.normal(0, np.sqrt(dt), N)
W = np.cumsum(dW)
# Geometric Brownian Motion
S = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)
plt.figure(figsize=(12, 4))
plt.plot(S)
plt.title('Geometric Brownian Motion');
# ### Variance of GBM
#
# - First, note that $S(t)$ is *log-normally distributed* because it is an exponential function of the normally distributed process $W(t)$.
# - The variance of $S(t)$ can be found from the properties of the log-normal distribution:
#
# $$\text{Var}(S(t)) = \left(e^{\sigma^2 t} - 1\right) e^{2\mu t} S_0^2.$$
#
# - This shows that the variance of GBM is not linear in $t$ like that of BM.
# - Instead, it grows *exponentially* with time due to the term $e^{\sigma^2 t}$.
# - This, together with the possibility of modelling a drift (the expected annual return), is the main addition of GBM over BM.
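#
# A quick Monte Carlo check of the variance formula at a fixed time $t$ (a sketch; the parameter values and the number of samples are arbitrary choices):
# In[ ]:
# Sketch: compare the empirical variance of simulated GBM values at time
# t_c with the closed-form expression. All parameter values are arbitrary.
mu_c, sigma_c, t_c, n_sims = 0.09, 0.25, 1.0, 100_000
W_t = np.random.normal(0, np.sqrt(t_c), n_sims)
S_t = 100 * np.exp((mu_c - 0.5 * sigma_c**2) * t_c + sigma_c * W_t)
var_theory = (np.exp(sigma_c**2 * t_c) - 1) * np.exp(2 * mu_c * t_c) * 100**2
print(f'Empirical: {S_t.var():.0f}, theoretical: {var_theory:.0f}')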
# ### Real world applications
#
# - GBM can be used to model real stock prices and simulate their future behavior.
# - First, we estimate $\mu$ and $\sigma$ from historical stock price data.
# - $\mu$ could be the historical average of the stock's logarithmic returns.
# - $\sigma$ could be the standard deviation of those returns.
# - Then, we use these estimates in the GBM formula to simulate future price paths.
# - This method is widely used for option pricing, risk management, and investment strategy simulations.
# - However, GBM has limitations, such as assuming a *constant* drift ($\mu$) and volatility ($\sigma$).
# - These assumptions may not hold true in real markets.
# - Therefore, it's often used as a component of a broader analysis or modeling strategy.
# In[15]:
# Step 1: Get the "training" data (e.g., 2020-2022)
data2 = get_data('GOOG', period='1d', start='2019-12-31', end='2022-12-31')
# In[16]:
# Get "test" data, for comparison (e.g., 2023)
data3 = get_data('GOOG', period='1d', start='2022-12-31', end='2023-12-31')
test_days = len(data3)
# In[17]:
# Step 2: Calculate Daily Returns
returns = data2['Close'].pct_change()  # Daily simple returns (fractional change between consecutive closes)
# In[18]:
# Step 3: Estimate Parameters for GBM
mu = returns.mean() * 252 # Annualize the mean
sigma = returns.std() * np.sqrt(252) # Annualize the std deviation
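# - As noted earlier, $\mu$ and $\sigma$ can also be estimated from *logarithmic* returns; a minimal sketch of that variant:
# In[ ]:
# Sketch of the log-return variant mentioned above. For small daily moves
# log(1 + r) is close to r, so these estimates are close to the ones based
# on simple returns. Note: under a GBM the annualized mean log return
# estimates mu - sigma^2/2 rather than mu.
log_returns = np.log(data2['Close'] / data2['Close'].shift(1)).dropna()
print(f'mu (log): {log_returns.mean() * 252:.3f}')
print(f'sigma (log): {log_returns.std() * np.sqrt(252):.3f}')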
# In[19]:
# Step 4: Set GBM parameters
T = 1 # Time horizon in years
dt = 1/test_days # Time step in years (one step per trading day in the test period)
N = int(T/dt) # Number of time steps
time_step = np.linspace(0, T, N)
S0 = data2['Close'].iloc[-1] # Starting stock price (latest close price)
# In[25]:
# Step 5: Compute Simulation
W = np.random.standard_normal(size=N)
W = np.cumsum(W)*np.sqrt(dt) # Cumulative sum for the Wiener process
X = (mu - 0.5 * sigma**2) * time_step + sigma * W
S = S0 * np.exp(X) # GBM formula
# In[26]:
# Plot the results
plt.figure(figsize=(12, 5))
plt.plot(data2['Close'], label='GOOG Historical Closing Prices')
plt.plot(data3.index, S, label='Simulated GBM Prices')
plt.plot(data3['Close'], label='Real Prices')
plt.legend()
plt.title('Google Stock Prices and Simulated GBM')
plt.xlabel('Date')
plt.ylabel('Price')
plt.xticks(rotation=45);
# - There is a stochastic component in each GBM realization.
# - Usually, more paths are simulated to get a more informative simulation.
# In[22]:
# Simulate multiple paths
n_paths = 10
paths = []
for _ in range(n_paths):
W = np.cumsum(np.random.standard_normal(size=N))*np.sqrt(dt)
X = (mu - 0.5 * sigma**2) * time_step + sigma * W
paths.append(S0 * np.exp(X))
path_mean = np.array(paths).mean(axis=0)
path_std = np.array(paths).std(axis=0)
# In[23]:
plt.figure(figsize=(12, 5))
plt.plot(data2['Close'], label='GOOG Historical Closing Prices')
plt.plot(data3.index, path_mean, label='Simulated GBM Prices')
plt.fill_between(data3.index, path_mean-1.96*path_std, path_mean+1.96*path_std, color='tab:orange', alpha=0.2, label='95% CI')
plt.plot(data3['Close'], label='Real Prices')
plt.legend(loc='upper left')
plt.title('Google Stock Prices and Simulated GBM')
plt.xlabel('Date')
plt.ylabel('Price')
plt.xticks(rotation=45);
# ---
# ## Summary
#
# In this lecture we covered:
# 1. Unit root tests and a deeper insight into the ADF test.
# 2. Mean Reversion and how to test it with ADF.
# 3. Hurst Exponent: how to compute it and its application in finance.
# 4. Geometric Brownian Motion: definition and application to simulate stocks.
#
# ---
# ## Exercise
#
# - Download and plot the historical closing prices of Tesla (``TSLA``) and Equinor (``EQNR``) for the period ``2019-12-31`` to ``2022-12-31``.
# - For each time series:
# - Test whether the time series looks stationary.
# - Compute the Hurst coefficient for both time series.
# - Which stock would you rather invest in? Motivate your answer based on the tests and the value of $H$.
# - Simulate the stock prices using GBM.
# - Which simulation seems to be more reliable? The one for Tesla or Equinor?
# - To motivate your answer:
# 1. Compute the simulation at least 100 times.
# 2. Compute the MSE between the true stock prices and the simulated ones.
# 3. Compare the expected value of the MAPE for the two stocks.