Backtest Framework with Python – Data With Purpose

What you will find in this post

What is a backtest?
Why backtest before live trading?
Reasons to choose a Python framework
Event-Driven or Vectorized
Three frameworks that I tested
- Qf-lib
- Zipline
- QStrader
Reasons for my choice
Comparison Table

What is a backtest?

Backtesting is a fundamental process in quantitative finance, and it plays a vital role in the development and evaluation of trading strategies. At its core, a backtest involves taking a trading strategy, which is essentially a set of rules and criteria for buying and selling assets, and applying it to historical market data. This process allows you to assess how the strategy would have performed in the past.

By simulating the strategy’s actions on historical data, backtesting provides a way to quantify the strategy’s performance and understand its strengths and weaknesses. Traders and investors can use backtests to gain insights into potential profitability, risk levels, and drawdowns. Moreover, it helps in the identification of issues that might not be immediately apparent when solely considering theoretical or paper-trading results. Ultimately, the goal of backtesting is to determine whether a trading strategy has the potential to be profitable in real-market conditions.

One key benefit of backtesting is risk mitigation. It enables traders to uncover vulnerabilities and flaws in their strategies without the risk of losing real capital. By understanding how the strategy would have performed in different market conditions and scenarios, traders can make informed decisions about whether to proceed with live trading. However, it’s important to note that while backtesting provides valuable insights, it is not a guarantee of future performance, as market conditions can change, and past data may not fully represent future conditions. Nevertheless, backtesting remains an essential step in the process of strategy development and refinement in quantitative finance.

Why backtest before live trading?

Backtesting serves as the cornerstone of prudent decision-making in quantitative finance and algorithmic trading. The primary reason for conducting backtests before implementing a trading strategy in live markets is risk mitigation. In financial markets, there’s no shortage of inherent risks, and deploying a new strategy without proper testing can expose traders and investors to significant financial losses. By conducting comprehensive backtests, you can uncover potential weaknesses in your strategy, which may not be evident through theoretical analysis alone. Identifying and addressing these weaknesses beforehand can help protect your capital and improve your chances of success in real trading.

Another compelling reason to backtest is performance assessment. Backtesting provides a historical perspective on how a trading strategy would have fared in different market conditions. It quantifies various performance metrics, such as returns, volatility, and drawdowns, giving you a clear understanding of the strategy’s historical performance. This information is invaluable for setting realistic expectations and for determining whether a strategy aligns with your financial goals and risk tolerance. Moreover, the insights gained from backtesting allow you to make informed decisions about position sizing, risk management, and overall strategy refinement, which are crucial aspects of successful trading. In essence, backtesting provides the data-driven foundation upon which you can build your trading strategy, increasing the likelihood of consistent and profitable trading in the long run.
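To make the metrics mentioned above concrete, here is a minimal sketch that computes total return, annualized volatility, and maximum drawdown from an equity curve. The function names and the equity values are made up for illustration, and the 252 trading days per year used for annualization is an assumption; the frameworks discussed later produce these numbers in their own reports.

```python
import math


def total_return(equity):
    """Overall return from the first to the last equity value."""
    return equity[-1] / equity[0] - 1.0


def period_returns(equity):
    """Simple returns between consecutive equity values."""
    return [curr / prev - 1.0 for prev, curr in zip(equity, equity[1:])]


def annualized_volatility(equity, periods_per_year=252):
    """Sample standard deviation of period returns, scaled to one year.

    periods_per_year=252 assumes daily data (an assumption, not a rule).
    """
    rets = period_returns(equity)
    mean = sum(rets) / len(rets)
    variance = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Made-up equity curve, for illustration only.
equity_curve = [100.0, 110.0, 99.0, 120.0, 90.0, 108.0]

print(f"Total return:          {total_return(equity_curve):.2%}")
print(f"Annualized volatility: {annualized_volatility(equity_curve):.2%}")
print(f"Maximum drawdown:      {max_drawdown(equity_curve):.2%}")
```

In this toy curve the account ends 8% up, but along the way it fell 25% from its peak of 120 to 90. That gap between the final return and the worst drawdown is exactly the kind of information that helps in judging whether a strategy fits your risk tolerance.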

Reasons to Choose a Python Framework

Python has emerged as a popular choice for building backtest frameworks in quantitative finance for several compelling reasons. First and foremost, its versatility makes it a top pick among programmers and data scientists alike. Since I have a bachelor’s degree in Computer Science, it makes sense for me to go in this direction. Python’s simple and readable syntax, combined with its extensive library ecosystem, facilitates the rapid development of backtesting solutions. This flexibility allows users to focus on strategy development rather than getting bogged down in complex programming details, ultimately reducing time-to-market for trading strategies.

Python’s vast library ecosystem is another major advantage. Libraries like NumPy, Pandas, and Matplotlib provide a comprehensive toolkit for data manipulation, analysis, and visualization. When creating a backtest framework, these libraries can be leveraged to efficiently process historical market data, perform statistical analysis, and generate performance reports. Additionally, the availability of machine learning libraries such as Scikit-Learn and TensorFlow makes it possible to integrate advanced analytics into the backtesting process, enabling the development of more sophisticated and adaptive trading strategies.

Furthermore, Python’s strong community support is a critical factor in its popularity. The Python community is both active and collaborative, with a wealth of resources, forums, and open-source projects dedicated to quantitative finance and algorithmic trading. This support network makes it easier for developers and traders to access expert insights, resolve issues, and stay updated with the latest developments in the field. The combination of Python’s versatility, libraries, and community support positions it as a powerful and accessible choice for building backtest frameworks in the world of quantitative finance.

Event-Driven or Vectorized

In the realm of backtesting, one of the fundamental decisions you’ll face is choosing between an event-driven or a vectorized approach. Each approach has its unique characteristics, and the choice depends on the specific requirements of your trading strategy and your preferences.

The event-driven approach involves simulating trading decisions in a step-by-step manner, much like a trader would make real-time decisions based on incoming market data. This approach excels in handling complex strategies with conditional rules and dependencies. It allows for a high degree of customization, making it suitable for strategies that require real-time data processing, such as algorithmic trading in liquid markets. Event-driven backtesting also provides the capability to implement risk management features dynamically, which is crucial in real trading scenarios where market conditions can change rapidly.
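As a rough illustration of the step-by-step idea, here is a minimal event-driven loop using only the standard library. Everything in it is a hypothetical sketch: the `Bar` and `MovingAverageStrategy` classes, the `run_backtest` function, and the moving-average rule are not taken from any of the frameworks discussed in this post. The important point is that the strategy only ever receives one bar at a time, and its decision is executed on the following bar.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Bar:
    """One market data event (hypothetical minimal structure)."""
    day: int
    close: float


class MovingAverageStrategy:
    """Illustrative rule: be long while the close is above its moving average."""

    def __init__(self, window):
        self.closes = deque(maxlen=window)

    def on_bar(self, bar):
        """Return the target position (1 = long, 0 = flat) from data seen so far."""
        self.closes.append(bar.close)
        if len(self.closes) < self.closes.maxlen:
            return 0  # Not enough history yet.
        average = sum(self.closes) / len(self.closes)
        return 1 if bar.close > average else 0


def run_backtest(bars, strategy, cash=1000.0):
    """Feed bars to the strategy one by one and track portfolio equity."""
    units = 0.0
    target = 0  # Decision taken on the previous bar, executed on this one.
    equity_curve = []
    for bar in bars:
        # 1. Execute the pending decision at the current bar's price.
        if target == 1 and units == 0.0:
            units, cash = cash / bar.close, 0.0
        elif target == 0 and units > 0.0:
            cash, units = units * bar.close, 0.0
        # 2. Mark the portfolio to market.
        equity_curve.append(cash + units * bar.close)
        # 3. Let the strategy react to the new bar; acted on at the next bar.
        target = strategy.on_bar(bar)
    return equity_curve


# Made-up prices, for illustration only.
prices = [10.0, 10.0, 10.0, 12.0, 12.0, 9.0, 9.0]
bars = [Bar(day=i, close=p) for i, p in enumerate(prices)]
curve = run_backtest(bars, MovingAverageStrategy(window=3))
print(curve)
```

Because the loop hands the strategy a single bar at a time, there is simply no way for `on_bar` to peek at future prices. Real frameworks add much more on top of this skeleton (order objects, commissions, slippage, multiple assets), but the control flow is similar in spirit.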

On the other hand, the vectorized approach processes data in larger batches or vectors, typically using libraries like NumPy, to execute trading strategies. This approach can provide significant performance advantages, especially when dealing with extensive historical datasets. It’s ideal for scenarios where your strategy doesn’t require real-time decision-making or relies on simpler, rule-based logic. While it may not offer the same level of customization as event-driven backtesting, it can be highly efficient when evaluating strategies that focus on historical data analysis, portfolio rebalancing, and performance measurement.
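For comparison, here is a minimal vectorized sketch with NumPy that computes the signal for every bar in one pass. The function name, the moving-average rule, and the prices are assumptions made for illustration. Note the one-bar shift of the signal: without it, each position would be decided using the same bar whose return it earns, which is an easy mistake to make in this style.

```python
import numpy as np


def vectorized_backtest(closes, window):
    """Equity curve (starting at 1.0) of a long-or-flat moving-average rule."""
    closes = np.asarray(closes, dtype=float)

    # Rolling mean via convolution; defined from index window - 1 onwards.
    kernel = np.ones(window) / window
    rolling_mean = np.convolve(closes, kernel, mode="valid")

    # Signal for all bars at once: 1.0 when the close is above its average.
    signal = np.zeros(closes.shape)
    signal[window - 1:] = closes[window - 1:] > rolling_mean

    # Shift by one bar so the position held over a bar's return
    # is decided from the previous bar's signal only.
    position = np.concatenate(([0.0], signal[:-1]))

    # Bar-to-bar asset returns (the first bar has no previous close).
    asset_returns = np.concatenate(([0.0], closes[1:] / closes[:-1] - 1.0))

    strategy_returns = position * asset_returns
    return np.cumprod(1.0 + strategy_returns)


# Made-up prices, for illustration only.
prices = [10.0, 10.0, 10.0, 12.0, 12.0, 9.0, 9.0]
equity = vectorized_backtest(prices, window=3)
print(equity)
```

There is no loop over bars here, which is why this style scales well to long histories. The trade-off is that order handling, position sizing, and risk rules that depend on the current portfolio state are harder to express as array operations.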

I chose the event-driven approach for the following reasons:

- Helps to avoid data snooping bias
- It is the approach used by most quantitative funds
- It is a more realistic approach
- Higher degree of customization and trade management
- Code reuse
- Better suited for live trading

Three frameworks that I tested

I tested the three backtesting frameworks by creating a conda environment for each, then installing them and running mostly the examples provided, sometimes applying small changes to the strategies or changing the data on which the backtest was conducted. This way I could get a “feel” for each framework.

My goal was to find the one best suited to my needs and the one that I felt most comfortable using. As you will see below, the reports they generate are very similar.

Qf-lib

Source: https://quarkfin.github.io/qf-lib-info/

Qf-lib is a Python library developed at CERN. It is a backtesting framework for developing investment strategies. The Backtester uses an event-driven architecture and simulates events such as daily market opening or closing. It is designed to test and evaluate any custom investment strategy.

It provides various tools for portfolio construction, time series analysis, and risk monitoring. In my tests, I could only get it to work with 1-minute and daily data. It is a very complete tool for backtesting and result analysis.

Zipline

Source: https://github.com/quantopian/zipline

Zipline is an open-source backtesting library developed by Quantopian, a financial technology company. The company closed in 2020, but the library is still available on GitHub and in active use.

Zipline is specifically designed for algorithmic trading and quantitative finance research. It provides a framework for simulating trading strategies using historical market data to assess their performance.

It is highly customizable, allowing users to define their trading algorithms and strategies. Additionally, it can be extended with custom data feeds, risk management logic, and execution models.

To me, the focus of Zipline is more on long/short strategies. I did not like its structure very much. The “bundle” process that makes data visible to Zipline takes some time to get working properly.

It works well with Alphalens, which helps assess the effectiveness of alpha factors, and with Pyfolio, which evaluates the performance of trading strategies and portfolios. Pyfolio provides a range of features, including detailed performance statistics, risk assessment, and interactive visualizations, making it a valuable tool for quantitatively assessing the results of a portfolio over time.

Report generated by Pyfolio on Zipline backtest

QStrader

Source: https://www.quantstart.com/qstrader/

QStrader is a Python-based open-source trading and backtesting framework specifically designed for algorithmic trading and quantitative finance. Developed by QuantStart, this framework offers a versatile and customizable platform for traders and developers to create, test, and deploy trading strategies. QStrader provides flexibility in strategy development. Its modular design allows users to extend functionality as needed, and it is known for its compatibility with multiple data sources, brokers, and execution interfaces. QStrader also provides a range of built-in features for performance measurement, risk management, and order handling, making it a comprehensive solution for those looking to develop and test algorithmic trading strategies in Python.

Reasons for my choice

After testing the above backtest frameworks, my choice was QStrader.

Simplicity – QStrader’s simplicity and clear module structure were easier for me to understand than those of the other two frameworks. It basically follows this blog post on QuantStart.
Retail trader focused – It is an institutional-style quantitative trading framework, with an emphasis on portfolio construction and risk management, but the content you find on QuantStart seems to be aimed at retail traders or people who want to learn about quantitative trading. I am in this category.
Modular Design – The framework is designed in a modular fashion, allowing you to easily extend its functionality. Since I want to develop and experiment with the functionality, both as a trading and as a learning experience, this was an important factor for me.
Code documentation – The code is well documented overall, although the documentation of each module could be improved.
Community Size – It has an active community, although it might not be as extensive as those of some other popular libraries. This is something you should consider. Judging by the GitHub page, questions there seem to get good, fast responses.

Comparison Table

Feature	Qf-lib	Zipline	QStrader
Programming Language	Python	Python	Python
Backtesting Approach	Event-Driven	Event-Driven	Event-Driven
Open Source	Yes	Yes	Yes
Customizable	Highly	Highly	Highly
Modular Design	Yes	No	Yes
Integration with Data Sources	Yes	Limited	Yes
Integration with Brokers	Yes	Limited	Yes
Reports	Complete	Moderate	Moderate
Learning Curve	Moderate	Beginner-friendly	Moderate

I hope this post helps guide your choices, but in the end, the best approach is to test them for yourself and identify which one best fits your needs and goals. If you have any questions, leave a comment below and I will get back to you as soon as possible!