This notebook will walk through a basic example of using the Stocker class to analyze a stock. The Stocker class is built on the quandl financial library and the fbprophet additive model library but hides all that code behind the scenes so you can focus on making sense of the data!
# Command for plotting in the notebook
import matplotlib.pyplot as plt
%matplotlib inline
Make sure to run this notebook in the same directory as stocker.py
from stocker import Stocker
An object is an instance of a Python class. To create a stocker object, we call the Stocker class with a valid stock ticker (there are over 3000 available). We will be using Microsoft throughout this example. If successful, the call will display that the data was retrieved and the date range.
microsoft = Stocker('MSFT')
MSFT Stocker Initialized. Data covers 1986-03-13 to 2018-01-22.
The Stocker object contains a number of attributes, or pieces of data, and methods, functions that act on that data. One attribute is the stock history, which is a dataframe. We can assign this to a variable and then look at the dataframe.
stock_history = microsoft.stock
stock_history.head()
Date | Open | High | Low | Close | Volume | Ex-Dividend | Split Ratio | Adj. Open | Adj. High | Adj. Low | Adj. Close | Adj. Volume | ds | y | Daily Change | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1986-03-13 | 25.50 | 29.25 | 25.5 | 28.00 | 3582600.0 | 0.0 | 1.0 | 0.058941 | 0.067609 | 0.058941 | 0.064720 | 1.031789e+09 | 1986-03-13 | 0.064720 | 0.005779 |
1 | 1986-03-14 | 28.00 | 29.50 | 28.0 | 29.00 | 1070000.0 | 0.0 | 1.0 | 0.064720 | 0.068187 | 0.064720 | 0.067031 | 3.081600e+08 | 1986-03-14 | 0.067031 | 0.002311 |
2 | 1986-03-17 | 29.00 | 29.75 | 29.0 | 29.50 | 462400.0 | 0.0 | 1.0 | 0.067031 | 0.068765 | 0.067031 | 0.068187 | 1.331712e+08 | 1986-03-17 | 0.068187 | 0.001156 |
3 | 1986-03-18 | 29.50 | 29.75 | 28.5 | 28.75 | 235300.0 | 0.0 | 1.0 | 0.068187 | 0.068765 | 0.065876 | 0.066454 | 6.776640e+07 | 1986-03-18 | 0.066454 | -0.001734 |
4 | 1986-03-19 | 28.75 | 29.00 | 28.0 | 28.25 | 166300.0 | 0.0 | 1.0 | 0.066454 | 0.067031 | 0.064720 | 0.065298 | 4.789440e+07 | 1986-03-19 | 0.065298 | -0.001156 |
The advantages of a class is that the data and the functions which act on the data are associated with a single variable. In effect, a class is an uber data-structure because it contains within it other data and functions. Here we will use one of the Stocker methods to plot the history of the stock.
First, a basic plot of the stock history and a few statistics.
microsoft.plot_stock()
Maximum Adj. Close = 91.61 on 2018-01-22. Minimum Adj. Close = 0.06 on 1986-03-24. Current Adj. Close = 91.61 on 2018-01-22.
The plot_stock method accepts a number of arguments that control the range of data plotted, the statistics plotted, and the type of plot. In a Jupyter notebook, you can type a function, and with your cursor in the parenthesis, press shift + tab to view all the available function parameters. Here we will plot the daily change in price and the daily volumn as a percentage relative to the average value.
microsoft.plot_stock(start_date = '2000-01-03', end_date = '2018-01-16',
stats = ['Daily Change', 'Adj. Volume'], plot_type='pct')
Maximum Daily Change = 2.08 on 2008-10-13. Minimum Daily Change = -3.34 on 2017-12-04. Current Daily Change = 1.61 on 2018-01-22. Maximum Adj. Volume = 591052200.00 on 2006-04-28. Minimum Adj. Volume = 7425503.00 on 2017-11-24. Current Adj. Volume = 23190700.00 on 2018-01-22.
If we want to feel good about ourselves, we can pretend as if we had the fortune of mind to invest in Microsoft at the beginning with 100 shares. We can then evaluate the potential profit we would have from those shares. You can also change the dates if you feel like trying to lose money!
microsoft.buy_and_hold(start_date='1986-03-13', end_date='2018-01-16', nshares=100)
MSFT Total buy and hold profit from 1986-03-13 to 2018-01-16 for 100 shares = $8829.11
microsoft.buy_and_hold(start_date='1999-01-05', end_date='2002-01-03', nshares=100)
MSFT Total buy and hold profit from 1999-01-05 to 2002-01-03 for 100 shares = $-56.92
Surprisingly, we can lose money playing the stock market!
An additive model represents a time series as an overall trend and patterns on different time scales (yearly, monthly, seasonally). While the overall direction of Microsoft is positive, it might happen to decrease every Tuesday, which if true, we could use to our advantage in playing the stock market.
The Prophet library, developed by Facebook provides simple implementations of additive models. It also has advanced capabilities for those willing to dig into the code. The Stocker object does the tough work for us so we can use it to just see the results. Another method allows us to create a prophet model and inspect the results. This method returns two objects, model and data, which we need to save to plot the different trends and patterns.
model, model_data = microsoft.create_prophet_model()
model.plot_components(model_data)
plt.show()
The overall trend is clearly in the upwards direction over the past three years. The trend over the course of a year appears to be a decreased in July, September, and October with the greatest increases in December and January. As the time scale decreases, the patterns grow more noisy. The monthly pattern appears to be slightly random, and I would not place too much confidence in investing between the 26th and 28th of the month!
If we think there may be meaningful weekly trends, we can add in a weekly seasonality component by modifying the associated attribute on our Stocker object. We then recreate the model and plot the components.
print(microsoft.weekly_seasonality)
microsoft.weekly_seasonality = True
print(microsoft.weekly_seasonality)
False True
model, model_data = microsoft.create_prophet_model(days=0)
model.plot_components(model_data)
plt.show()
We have added a weekly component into the data. We can ignore the weekends because trading only occurs during the week (prices do slightly change overnight because of after-market trading, but the differences are small enough to not make affect our analysis). There is therefore no trend during the week. This is to be expected because on a short enough timescale, the movements of the market are essentially random. It is only be zooming out that we can see the overall trend. Even on a yearly basis, there might not be many patterns that we can discern. The message is clear: playing the daily stock market should not make sense to a data scientist!
# Turn off the weekly seasonality because it clearly did not work!
microsoft.weekly_seasonality=False
One of the most important concepts in a time-series is changepoints. These occur at the maximum value of the second derivative. If that doesn't make much sense, they are times when the series goes from increasing to decreasing or vice versa, or when the series goes from increasing slowly to increasing rapidly.
We can easily view the changepoints identified by the Prophet model with the following method. This lists the changepoints and displays them on top of the actual data for comparison.
microsoft.changepoint_date_analysis()
Changepoints sorted by slope rate of change (2nd derivative): Date Adj. Close delta 410 2016-09-08 55.811396 -1.378093 338 2016-05-26 50.113453 1.116720 217 2015-12-02 52.572008 -0.882359 458 2016-11-15 57.589819 0.603127 48 2015-04-02 37.612590 0.442776
Prophet only identifies changepoints in the first 80% of the data, but it still gives us a good idea of where the most movement happens. It we wanted, we could look up news about Microsoft on those dates and try to corroborate with the changes. However, I would rather have that done automatically so I built it into Stocker.
If we specify a search term in the call to changepoint_date_analysis
, behind the scenes, Stocker will query the Google Search Trends api for that term. The method then displays the top related queries, the top rising queries, and provides a graph. The graph is probably the most valuable part as it shows the frequency of the search term and the changepoints on top of the actual data. This allows us to try and corroborate the search term with either the changepoints or the share price.
microsoft.changepoint_date_analysis(search = 'Microsoft profit')
Top Related Queries: query value 0 microsoft non profit 100 1 microsoft office 60 2 apple profit 40 3 microsoft 365 40 4 apple 35 Rising Related Queries: query value 0 apple stock 170 1 microsoft 365 130 2 apple profit 50
There looks to be more signal in the search frequency graph than noise! I'm sure there may be correlations, but the question is whether there are meaningful causes. We can use any search term we want, and there are likely to be all sorts of correlations that are unexpected but are just noise. It might not be a great idea to assign the search frequency much weight. Nonetheless, it is an interesting exercise!
microsoft.changepoint_date_analysis(search = 'Microsoft Office')
Top Related Queries: query value 0 microsoft office download 100 1 microsoft office 2010 90 2 office 2010 85 3 microsoft office 2013 75 4 office 2013 70 Rising Related Queries: query value 0 microsoft office 2016 key 80300 1 office 2016 73200 2 download microsoft office 2016 72150 3 microsoft office 2016 mac 69350 4 microsoft office 2016 67650
Now that we have analyzed the stock, the next question is where is it going? For that we will have to turn to predictions! That is for another notebook, but here is a little idea of what we can do (check out the documentation on GitHub for full details).
model, future = microsoft.create_prophet_model(days=180)
Predicted Price on 2018-07-21 = $102.40