Using Python for Strategy Creation and Backtesting

This is a quick overview of how you can start backtesting your strategies using Python. It summarises what I learned from the book Trading Evolved: Anyone Can Build Killer Trading Strategies in Python by Andreas Clenow. It will not cover everything under the sun, but hopefully, as a community, we will be able to dig deeper. As data becomes more easily available, retail traders will also be able to afford to build and test these strategies.

Prerequisites:

Python knowledge is mandatory, although you can learn it as you go. You must also know how to Google problems. You will probably face a lot of issues while running code and creating setups, and Google is the saviour.

Sharpe Ratio:
Nobody likes volatility. It would be great if we could get a small gain every single day and move up in a straight line. Unfortunately, volatility is required to create returns. What we try to achieve is to use as little volatility as we can to pay for the performance, and that is what the Sharpe Ratio measures: the return earned in excess of the risk-free rate, divided by the standard deviation of returns. The Sharpe Ratio is probably the most widely used and best-known performance metric. It gives you a general idea of the risk-adjusted performance of a strategy. Naturally, you need to go deeper and analyze the details for a proper strategy evaluation, but Sharpe will give you a good overview to start off with.
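
To make the idea concrete, here is a minimal sketch of an annualized Sharpe Ratio calculation using only the standard library. The function name, the 252 trading days per year, the zero risk-free rate and the two return series are all assumptions for illustration, not something taken from the book. Both series have the same average daily return, so the only difference is how much volatility was "paid" for it.

```python
import math
import statistics


def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a sequence of periodic returns.

    risk_free_rate is an annual rate; 252 trading days per year is an assumption.
    """
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in daily_returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility; Sharpe ratio is undefined")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


# Hypothetical daily returns; both series average 0.1% per day.
steady = [0.001, 0.002, 0.000, 0.001, 0.002, 0.000]
choppy = [0.020, -0.018, 0.015, -0.013, 0.010, -0.008]

print(round(sharpe_ratio(steady), 2))
print(round(sharpe_ratio(choppy), 2))
```

The steady series should come out with the much higher ratio, even though both earn the same average return.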

Picking a backtesting Engine:

There are multiple Python backtesting engines (libraries), such as finmarketpy and Backtrader. I have used Zipline-Reloaded.

Bundle and Ingest:

A bundle is an interface to import data into Zipline. Zipline stores data in its own preferred format, and it has a good reason for doing so: it can read data incrementally and hold only a part of it in memory at any given time. Then there is the second word, ingest. It refers to the process of reading data with the help of a bundle and storing it in Zipline’s own format. You need to ingest a bundle before you can run a backtest. You do this from the terminal, by running a command that tells Zipline to use a specific bundle to read data and store it so that it is ready for use by a backtest.

Data Source (To start):

So far, I have tried the free US market data provided by Nasdaq (Quandl). It’s a large dataset covering 1990 to 2018, and it is good enough to get started with Python and strategy building.

Data Source (For Indian Markets):

I still need to find a cheap and reliable data provider. For now, I have found the following datasets on Kaggle:

Stock price data of 1700+ NSE stocks and 50+ Indexes from Jan 1990 to June 2021 (NSE India Stock Data (1990 - 2021) | Kaggle)

BSE and NSE data from Jan 2000 to June 2020 (Indian stock market data | Kaggle)

All mutual fund daily NAV data starting from 1st April 2006 (Mutual Fund India | Kaggle)

How to set up?

  1. Install Anaconda (https://www.anaconda.com/)
  2. conda create -n z38 python=3.8 # create a new environment named z38 with Python 3.8
  3. conda activate z38 # activate that environment
  4. conda install -y -c conda-forge mamba # install mamba
  5. mamba install -c ml4t -c conda-forge -c ranaroussi zipline-reloaded # install the zipline-reloaded package using mamba
  6. Register a free account with Nasdaq (Quandl) and get your API key. The website provides free US market data for training your model
  7. conda env config vars set QUANDL_API_KEY=mykey # replace mykey with your API key (no space after the =), then run conda activate z38 again so the variable takes effect
  8. zipline ingest -b quandl # fetch the data from Quandl and store it as a bundle
  9. conda install -c ml4t pyfolio-reloaded # this library helps us analyse the results
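
If the ingest in step 8 complains about a missing key, a quick check from Python can confirm whether the variable from step 7 is visible inside the activated environment. This is only a sketch; the helper name `quandl_key_configured` is made up for illustration, and the only thing assumed about Zipline is that it reads the key from the `QUANDL_API_KEY` environment variable set above.

```python
import os


def quandl_key_configured(environ=None):
    """Return True if QUANDL_API_KEY is present and non-empty."""
    env = os.environ if environ is None else environ
    return bool(env.get("QUANDL_API_KEY", "").strip())


if __name__ == "__main__":
    if quandl_key_configured():
        print("QUANDL_API_KEY is set in this environment.")
    else:
        print("QUANDL_API_KEY is missing; set it (step 7) and reactivate the environment.")
```

Run it with the z38 environment active. If it reports the key as missing right after step 7, reactivating the environment is usually what was skipped.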

Sample Code

The link below has all the code from the book Trading Evolved, organised by chapter. You can run it directly in Python.
All the code from the book Trading Evolved

How to use your own custom data (other than Quandl) with Zipline?

Please read Chapters 23 and 24 of the book Trading Evolved. They give an overview of importing custom data as a bundle into the Zipline library.
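
As a rough idea of what custom data can look like before it becomes a bundle: as far as I understand from Zipline's documentation, its `csvdir` bundle reads one CSV file per symbol with the columns `date, open, high, low, close, volume, dividend, split`. The sketch below is not the book's approach and only writes a file in that layout using the standard library. The column list is my assumption from the docs, and the prices, the helper name and the `daily/SYMBOL.csv` path are hypothetical.

```python
import csv
import io

# Column layout assumed from Zipline's csvdir bundle documentation.
CSVDIR_COLUMNS = ["date", "open", "high", "low", "close", "volume", "dividend", "split"]


def write_csvdir_rows(bars, handle):
    """Write daily bars (dicts keyed by CSVDIR_COLUMNS) to an open text file."""
    writer = csv.DictWriter(handle, fieldnames=CSVDIR_COLUMNS)
    writer.writeheader()
    for bar in bars:
        # With no corporate action, dividend defaults to 0.0 and split to 1.0.
        writer.writerow({"dividend": 0.0, "split": 1.0, **bar})


# Hypothetical bars for a made-up symbol.
bars = [
    {"date": "2021-01-04", "open": 100.0, "high": 102.5, "low": 99.5, "close": 101.0, "volume": 15000},
    {"date": "2021-01-05", "open": 101.0, "high": 103.0, "low": 100.0, "close": 102.0, "volume": 12000},
]

buffer = io.StringIO()  # stands in for a real file such as daily/SYMBOL.csv
write_csvdir_rows(bars, buffer)
print(buffer.getvalue())
```

In practice you would open a real file instead of the in-memory buffer and write one file per symbol, then check the bundle documentation for how to register and ingest that directory.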

Books and Sources:

Beginner Level:
Trading Evolved: Anyone can Build Killer Trading Strategies in Python by Andreas Clenow

Advanced Level:
Machine Learning for Algorithmic Trading by Stefan Jansen

If you’re using an M1 Mac, follow the steps below to install Zipline:

CONDA_SUBDIR=osx-64 conda create -n [environment] # create a new environment
conda activate [environment]
conda env config vars set CONDA_SUBDIR=osx-64 # subsequent commands will use intel packages
conda install -c ml4t -c conda-forge -c ranaroussi zipline-reloaded

This book was an impulsive read for me, and I might stop exploring this topic after a while. But I hope this information helps others get started.

Update: NSE bhavcopy files from 1995 to 2017, unadjusted

You could also use the Yahoo Finance library (yfinance · PyPI) for historical data and current prices.

You can take a look at nsepy as well. It gives current and historical data. The challenge is cleaning the data.

For a basic backtest these will work. For actual rigorous testing, you will need to download historical data and clean it up.

I used yfinance but can’t readily remember whether they clean the data. I will check that and post.
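
To give a rough idea of what "cleaning" might involve for daily bars, here is a small sketch in plain Python. The rules (drop rows with missing or non-positive prices, drop rows whose high or low is inconsistent, keep one row per date, sort by date) are my own assumptions about common problems, not a complete recipe, and the sample rows are made up.

```python
def clean_daily_bars(bars):
    """Return bars sorted by date with duplicates and obviously bad rows removed.

    Each bar is a dict with 'date' (ISO string), 'open', 'high', 'low', 'close', 'volume'.
    """
    cleaned = {}
    for bar in bars:
        prices = [bar.get(key) for key in ("open", "high", "low", "close")]
        if any(p is None or p <= 0 for p in prices):
            continue  # missing or non-positive price
        if bar["high"] < max(bar["open"], bar["close"], bar["low"]):
            continue  # high is not the highest price of the day
        if bar["low"] > min(bar["open"], bar["close"]):
            continue  # low is not the lowest price of the day
        cleaned[bar["date"]] = bar  # a later duplicate replaces an earlier one
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return [cleaned[date] for date in sorted(cleaned)]


# Hypothetical raw rows: out of order, one duplicate, one bad high.
raw = [
    {"date": "2021-01-05", "open": 101.0, "high": 103.0, "low": 100.0, "close": 102.0, "volume": 12000},
    {"date": "2021-01-04", "open": 100.0, "high": 102.5, "low": 99.5, "close": 101.0, "volume": 15000},
    {"date": "2021-01-04", "open": 100.0, "high": 102.5, "low": 99.5, "close": 101.0, "volume": 15000},
    {"date": "2021-01-06", "open": 102.0, "high": 0.0, "low": 101.0, "close": 102.5, "volume": 9000},
]

for bar in clean_daily_bars(raw):
    print(bar["date"], bar["close"])
```

Real data usually also needs adjusting for splits and dividends, which this sketch does not attempt.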
