Backtesting.py Quick Start User Guide

This tutorial shows some of the features of backtesting.py, yet another Python package for backtesting trading strategies.

Firstly, what backtesting.py is not: it is not a data source; you bring your own data. It does not support strategies that rely on multiple orders, hedging, position sizing, or multi-asset portfolio rebalancing. Instead, backtesting.py works with a single asset at a time and a single position at a time (long or short), and the position size is (as yet) non-adjustable, corresponding to 100% of available funds. Backtesting.py is not aware of order types, does not properly simulate a broker, and cannot be connected to one.

As a trade-off, backtesting.py is a blazing fast, small and lightweight backtesting library that uses state-of-the-art Python data structures and procedures. The entire API easily fits into the memory banks of a single human individual. It is best suited for optimizing position entry and exit strategies and decisions based on the values of technical indicators, and it is also a versatile interactive tool for visualizing trading strategies.

Data

You bring your own data. Backtesting.py ingests data as a pandas.DataFrame with columns 'Open', 'High', 'Low', 'Close', and (optionally) 'Volume'. Such data is easily obtainable (see e.g. pandas-datareader, Quandl, findatapy, ...). Your data frames can have other columns, but these are required. The DataFrame should ideally be indexed with a datetime index (convert it with pd.to_datetime()); otherwise a simple range index will do.
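
As a minimal sketch of preparing such a frame, assume the raw quotes arrive with lowercase column names and string dates. The `raw` records below are made-up illustration values, not real market data, and the lowercase names are only an assumption about what a data source might deliver:

```python
import pandas as pd

# Hypothetical raw quotes; all values are made up for illustration.
raw = pd.DataFrame({
    'date': ['2020-01-03', '2020-01-02', '2020-01-06'],
    'open': [10.5, 10.0, 10.2],
    'high': [10.9, 10.8, 10.6],
    'low': [10.1, 9.9, 10.0],
    'close': [10.2, 10.5, 10.4],
    'volume': [1200, 1000, 900],
})

# Rename to the capitalized column names described above
data = raw.rename(columns={'open': 'Open', 'high': 'High', 'low': 'Low',
                           'close': 'Close', 'volume': 'Volume'})

# Parse the date strings, use them as the index, and sort ascending
data['date'] = pd.to_datetime(data['date'])
data = data.set_index('date').sort_index()

print(data)
```

Sorting by date matters because the rows are later consumed one bar at a time, oldest first.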

In [1]:

# Example OHLC data for Google Inc.
from backtesting.test import GOOG

GOOG.tail()

Out[1]:

              Open    High     Low   Close   Volume
2013-02-25  802.30  808.41  790.49  790.77  2303900
2013-02-26  795.00  795.95  784.40  790.13  2202500
2013-02-27  794.80  804.75  791.11  799.78  2026100
2013-02-28  801.10  806.99  801.03  801.20  2265800
2013-03-01  797.80  807.14  796.15  806.19  2175400

Strategy

Let's create our first strategy to backtest on these Google data. Let it be a simple moving average (MA) cross-over strategy.

Backtesting.py doesn't contain its own set of technical indicators. In practice, the user should probably use functions from their favorite indicator library, such as TA-Lib, Tulipy, PyAlgoTrade, ... But for this example, we define a simple helper moving average function.

In [2]:

import pandas as pd


def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()

Note, this is the exact same helper function as the one used in the project unit tests, so we could just import that instead.

In [3]:

from backtesting.test import SMA
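
To make the helper's behavior concrete, here is a small sketch on toy values. The function is re-declared so that the snippet runs on its own. Note that the first n - 1 entries are NaN, because a full window of n values is not yet available there:

```python
import pandas as pd


def SMA(values, n):
    """Rolling mean over `n` values, as in the helper defined above."""
    return pd.Series(values).rolling(n).mean()


sma = SMA([1, 2, 3, 4, 5], 3)
print(sma.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```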

A custom strategy needs to extend the backtesting.Strategy class and override its two methods: init() and next().

Method init() is invoked at the beginning, before the strategy is run. Within it, one ideally precomputes in efficient, vectorized fashion whatever indicators and signals the strategy depends on.

Method next() is iteratively called by the backtest instance, once for each data point (data frame row), simulating the incremental availability of each new full candlestick bar. Note, backtesting.py cannot make decisions / trades within candlesticks: any trade is executed on the next candle's open (or the current candle's close, see Backtest(trade_on_close)). If you need to trade within candlesticks (e.g. daytrading), instead begin with more fine-grained (e.g. hourly) data.

In [4]:

from backtesting import Strategy
from backtesting.lib import crossover


class SmaCross(Strategy):
    
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 10
    n2 = 20
    
    def init(self):
        # Precompute two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
    
    def next(self):
        # If sma1 crosses above sma2, buy the asset
        if crossover(self.sma1, self.sma2):
            self.buy()

        # Else, if sma1 crosses below sma2, sell it
        elif crossover(self.sma2, self.sma1):
            self.sell()

In init() as well as in next(), the data the strategy is simulated on is available as an instance variable self.data.

In init(), we compute indicators indirectly by wrapping them in self.I(). The wrapper is passed a function (here, our SMA function) along with any arguments to call it with (here, our close values and the MA lag). Indicators wrapped in this way will be plotted, and their names, intelligently inferred, will appear in the plot legend.

In next(), we simply check whether the faster moving average has just crossed the slower one. If it crossed upwards, we go long; if it crossed downwards, we close any open long position and go short. Note, there is no position size to adjust; backtesting.py assumes the maximal possible position. We use the backtesting.lib.crossover() function instead of writing more obscure and confusing conditions, such as:

def next(self):
    if (self.sma1[-2] < self.sma2[-2] and
            self.sma1[-1] > self.sma2[-1]):
        self.buy()

    elif (self.sma1[-2] > self.sma2[-2] and
            self.sma1[-1] < self.sma2[-1]):
        self.sell()

Ugh!
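
For intuition, the two conditions above can be folded into one small helper. This is only a sketch of the idea under the hypothetical name `crossed_above`; it is not claimed to be how backtesting.lib.crossover() is actually implemented:

```python
def crossed_above(series1, series2):
    """
    Return True if `series1` just crossed above `series2`, i.e. it was
    below on the previous bar and is above on the latest bar.
    Hypothetical helper for illustration, not the library's code.
    """
    return series1[-2] < series2[-2] and series1[-1] > series2[-1]


fast = [9.0, 11.0]   # made-up values of the faster moving average
slow = [10.0, 10.0]  # made-up values of the slower moving average

print(crossed_above(fast, slow))  # True: fast moved from below to above
print(crossed_above(slow, fast))  # False: slow did not cross above fast
```

Swapping the arguments tests for a cross in the opposite direction, which is the same trick the strategy uses when it calls crossover(self.sma2, self.sma1).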

In init(), the whole series of points was available, whereas in next(), the length of self.data and any declared indicator arrays is adjusted on each next() call so that array[-1] (e.g. self.data.Close[-1] or self.sma1[-1]) always contains the most recent value, array[-2] the previous value, etc. (ordinary Python indexing of ascending-sorted 1D arrays).
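
The following sketch imitates that behavior with a plain NumPy array and made-up prices. It only illustrates the indexing convention; the loop is an assumption for demonstration and not the library's internal mechanism:

```python
import numpy as np

close = np.array([10.0, 11.0, 12.0, 11.5])  # made-up closing prices

seen = []
for i in range(1, len(close) + 1):
    visible = close[:i]  # the part of the data "revealed" so far
    latest = float(visible[-1])  # most recent value
    # The previous value exists only from the second bar onwards
    previous = float(visible[-2]) if len(visible) > 1 else None
    seen.append((latest, previous))

print(seen)
# [(10.0, None), (11.0, 10.0), (12.0, 11.0), (11.5, 12.0)]
```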

Note: self.data and any indicators wrapped with self.I (e.g. self.sma1) are NumPy arrays for performance reasons. If you need pandas.Series, use .to_series() method (e.g. self.data.Close.to_series()) or construct the series manually (e.g. pd.Series(self.data.Close, index=self.data.index)).

Let's see how our strategy performs on historical Google data. We begin with 10,000 units of cash and set the broker's commission to a realistic 0.2%.

In [5]:

from backtesting import Backtest

bt = Backtest(GOOG, SmaCross, cash=10000, commission=.002)
bt.run()

Out[5]:

Start                     2004-08-19 00:00:00
End                       2013-03-01 00:00:00
Duration                   3116 days 00:00:00
Exposure [%]                            94.29
Equity Final [$]                     69665.12
Equity Peak [$]                      69722.15
Return [%]                             596.65
Buy & Hold Return [%]                  703.46
Max. Drawdown [%]                      -33.61
Avg. Drawdown [%]                       -5.68
Max. Drawdown Duration      689 days 00:00:00
Avg. Drawdown Duration       41 days 00:00:00
# Trades                                   93
Win Rate [%]                            53.76
Best Trade [%]                          56.98
Worst Trade [%]                        -17.03
Avg. Trade [%]                           2.44
Max. Trade Duration         121 days 00:00:00
Avg. Trade Duration          32 days 00:00:00
Expectancy [%]                           6.92
SQN                                      1.77
Sharpe Ratio                             0.22
Sortino Ratio                            0.54
Calmar Ratio                             0.07
_strategy                            SmaCross
dtype: object

The Backtest instance is initialized with data and a strategy class (see API reference for additional options).

When the Backtest.run() method is called, it returns a pandas Series of simulation results and statistics associated with our strategy. We see that this simple strategy returns almost 600% over roughly eight and a half years (3116 days), with a maximum drawdown of about 34% and the longest drawdown period spanning almost two years (689 days).
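
For readers unfamiliar with the drawdown statistics, here is a minimal sketch of how a maximum drawdown can be computed from an equity curve, assuming the common definition (largest percentage drop from a running peak). The equity values are made up, and the library's exact computation may differ in detail:

```python
import pandas as pd

equity = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0, 104.0])  # made-up

running_peak = equity.cummax()         # highest equity seen so far
drawdown = equity / running_peak - 1   # fractional drop from that peak
max_drawdown_pct = drawdown.min() * 100

print(round(max_drawdown_pct, 2))  # -25.0 (the drop from 120 to 90)
```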

Backtest.plot() method provides the same results in a more visual form.

In [6]:

bt.plot()

Optimization

We hard-coded the two lag parameters (n1 and n2) into our strategy above. However, the strategy may work better with 15–30 or some other cross-over. We define the parameters as optimizable by making them class variables.

We optimize the two parameters by calling the Backtest.optimize() method, passing each parameter as a keyword argument that points to its pool of values to test. Parameter n1 is tested for values from 5 to 25 and parameter n2 for values from 10 to 65, both in steps of 5 (the upper bound of range() is exclusive). Some combinations of the two parameters are invalid, i.e. n1 should not be larger than or equal to n2. We limit admissible parameter combinations with an ad hoc constraint function, which returns True (admissible) whenever n1 is less than n2. Additionally, we search for the parameter combination that maximizes final equity, and hence return, over the observed period. We could instead choose to optimize any key from the returned stats series.

In [7]:

%%time

stats = bt.optimize(n1=range(5, 30, 5),
                    n2=range(10, 70, 5),
                    maximize='Equity Final [$]',
                    constraint=lambda p: p.n1 < p.n2)

CPU times: user 175 ms, sys: 59.5 ms, total: 234 ms
Wall time: 1.33 s
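
To get a feel for the size of this search space, the admissible parameter pairs can be enumerated by hand. This sketch assumes an exhaustive search over all combinations and only mirrors the two ranges and the constraint passed above; it does not run any backtests:

```python
from itertools import product

n1_values = range(5, 30, 5)   # 5, 10, 15, 20, 25
n2_values = range(10, 70, 5)  # 10, 15, ..., 65

# Keep only the pairs that satisfy the constraint n1 < n2
admissible = [(n1, n2)
              for n1, n2 in product(n1_values, n2_values)
              if n1 < n2]

print(len(admissible))  # 50 of the 60 possible pairs
```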

In [8]:

stats

Out[8]:

Start                       2004-08-19 00:00:00
End                         2013-03-01 00:00:00
Duration                     3116 days 00:00:00
Exposure [%]                              98.14
Equity Final [$]                      106429.70
Equity Peak [$]                       109515.30
Return [%]                               964.30
Buy & Hold Return [%]                    703.46
Max. Drawdown [%]                        -43.98
Avg. Drawdown [%]                         -5.70
Max. Drawdown Duration        690 days 00:00:00
Avg. Drawdown Duration         36 days 00:00:00
# Trades                                    152
Win Rate [%]                              51.32
Best Trade [%]                            60.81
Worst Trade [%]                          -20.80
Avg. Trade [%]                             1.90
Max. Trade Duration            83 days 00:00:00
Avg. Trade Duration            21 days 00:00:00
Expectancy [%]                             5.97
SQN                                        1.51
Sharpe Ratio                               0.19
Sortino Ratio                              0.49
Calmar Ratio                               0.04
_strategy                 SmaCross(n1=10,n2=15)
dtype: object

We can look into stats._strategy to access the Strategy instance and its optimal parameter values (10 and 15).

In [9]:

bt.plot()

Strategy optimization raised the in-sample return from about 597% to about 964%, a relative improvement of roughly 60%, and beat buy & hold (703%). In real life, however, always take steps to avoid overfitting before putting real money at risk.
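
One common precaution is to tune parameters on an earlier slice of the data and then check them on a later, untouched slice. A minimal sketch, assuming a plain chronological split; the synthetic `data` frame and the 70/30 ratio are arbitrary illustration choices:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a price frame; values are made up.
index = pd.date_range('2020-01-01', periods=10, freq='D')
data = pd.DataFrame({'Close': np.linspace(100.0, 109.0, 10)}, index=index)

# First 70% of rows for tuning, the remaining 30% held out for checking
split = len(data) * 7 // 10
in_sample = data.iloc[:split]
out_of_sample = data.iloc[split:]

print(len(in_sample), len(out_of_sample))  # 7 3
```

The split is chronological rather than random so that the held-out slice lies strictly after the data used for tuning.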

Learn more by exploring further examples or find more program options in the full API reference.