Hacker’s Tools

For our financial hacking experiments (and for harvesting their financial fruits) we need some software machinery for research, testing, training, and live trading financial algorithms. There are many tools for algo trading, but no existing software platform today is really up to all those tasks. You have to put together your system from different software packages. Fortunately, two are normally sufficient. I’ll use Zorro and R for most articles on this blog, but will also occasionally look into other tools.

Choice of languages

Algo trading systems are normally based on a script in some programming language. You can avoid writing scripts entirely by using a visual ‘strategy builder’, ‘code wizard’ or spreadsheet program for defining your strategy. But this is also some sort of programming, just in a different language that you have to master. And visual builders can only create rather simple ‘indicator soup’ systems that are unlikely to produce consistent trading profit. For serious algo trading systems, real development, and real research, there’s no stepping around ‘real programming’.

You’re also not free to select the programming language with the nicest or easiest syntax. One of the best compromises of simplicity and object orientation is probably Python. It also offers libraries with useful statistics and indicator functions. Consequently, many strategy developers start with programming their systems in Python… and soon run into serious limitations. There’s another criterion that is more relevant for system development than syntax: execution speed.

Speed mostly depends on whether a computer language is compiled or interpreted. C, C++, Pascal, and Java are compiled languages, meaning that the code runs directly on the processor (C, C++, Pascal) or on a ‘virtual machine’ (Java). Python, R, and Matlab are interpreted: The code won’t run by itself, but is executed by interpreter software. Interpreted languages are much slower and need more CPU and memory resources than compiled languages. For trading strategies, it doesn’t help much that these languages have fast C-programmed libraries: all backtests and optimization processes must still run through the bottleneck of interpreted trading logic. Theoretically the slowness can be worked around with ‘vectorized coding’ – see below – but that has little practical use.

R and Python have other advantages. They are interactive: you can enter commands directly at a console. This allows quick code or function testing. Some languages, such as C#, are in between: They are compiled to a machine-independent intermediate code that is then, depending on the implementation, either interpreted or converted to machine code. C# is about 4 times slower than C, but still 30 times faster than Python.

Here’s a benchmark table of the same two test programs written in several languages: a sudoku solver and a loop with a 1000 x 1000 matrix multiplication (in seconds):

Language      Sudoku   Matrix
C, C++           1.0      1.8
Java             1.7      2.6
Pascal             -        4
C#               3.8        9
JavaScript      18.1       16
Basic (VBA)        -       25
Erlang            18       31
Python           119      121
Ruby              98      628
Matlab             -      621
R                  -     1738

Speed becomes important as soon as you want to develop a short-term trading system. In the development process you’re constantly testing system variants. A 10-year backtest with M1 historical data executes the strategy about 3 million times. If a C-written strategy needs 1 minute for this, the same strategy in EasyLanguage would need about 30 minutes, in Python 2 hours, and in R more than 10 hours! And that’s only a backtest, not an optimization or WFO run. If I had coded the trend experiment in Python or R, I would still be waiting for the results today. You can see why trade platforms normally use a C variant or a proprietary compiled language for their strategies. HFT systems are written in C or directly in machine language anyway.
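
The bar count and the run times in the paragraph above can be checked with some quick arithmetic. The following is only a sketch in plain C under stated assumptions: a Forex-like market that trades 24 hours a day on 5 days per week, 52 weeks per year, with no holidays. The function names m1_bars and backtest_minutes are made up for this illustration, and the slowdown factor is simply the ratio between the quoted run times.

```c
#include <assert.h>

/* Number of one-minute (M1) bars in a backtest period.
   Assumption (not from the article): the market trades 24 hours a day
   on 'days_per_week' days, 52 weeks per year, without holidays. */
long m1_bars(int years, int days_per_week)
{
    assert(years >= 0 && days_per_week >= 0 && days_per_week <= 7);
    return (long)years * 52L * days_per_week * 24L * 60L;
}

/* Estimated backtest time in minutes, given the time the same strategy
   needs in C and a slowdown factor relative to C. */
double backtest_minutes(double c_minutes, double slowdown)
{
    assert(c_minutes >= 0. && slowdown > 0.);
    return c_minutes * slowdown;
}
```

With these assumptions, m1_bars(10, 5) gives 3,744,000 bars, which is in the same range as the roughly 3 million quoted above; holidays and data gaps reduce the real count further. backtest_minutes(1.0, 120.0) gives 120 minutes, the 2 hours quoted for Python.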

Even compiled languages can have large speed differences due to different implementations of trading and analysis functions. When we compare not Sudoku or a matrix multiplication, but a real trading system – the small RSI strategy from this page – we find very different speeds on different trading platforms (10-year backtest, tick resolution):

  • Zorro: ~ 0.5 seconds (compiled C)
  • MT4:  ~ 110 seconds (MQL4, a C variant)
  • MultiCharts: ~ 155 seconds (EasyLanguage, a C/Pascal mix)

However, the differences are not as bad as suggested by the benchmark table. In most cases the slow language speed is partially compensated by fast vector function libraries. A script that does not go step by step through historical data, but only calls library functions that process all data simultaneously, would run with comparable speed in all languages. Indeed some trading systems can be coded in this vectorized method, but unfortunately this works only with simple systems and requires entirely different scripts for backtests and live trading.

Choice of tools

Zorro is a software tool for financial analysis and algo trading – a sort of Swiss Army knife, since you can use it not only for live trading, but also for all sorts of quick tests. It’s my software of choice for financial hacking because:

  • It’s free (unless you’re rich).
  • Scripts are in C, event driven and very fast. You can code a system or an idea in 5 minutes.
  • Open architecture – you can add anything with DLL plugins.
  • Minimalistic – just a frontend to a programming language.
  • Can be automated for experiments.
  • Very stable – I rarely found bugs and they were fixed fast.
  • Very accurate, realistic trading simulation, including HFT.
  • Also supports options and futures, and portfolios of multiple assets.
  • Has a library with 100s of indicators, statistics and machine learning functions, most with source code.
  • Is continuously developed and supported (new versions usually come out every 2 to 3 months).
  • Last but not least: I know it quite well, as I’ve written its tutorial…

[Image: Zorro user interface]

A strategy example coded in C, the classic SMA crossover:

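void run()
{
  double* Close = series(priceClose());    // series of closing prices
  double* MA30 = series(SMA(Close,30));    // fast moving average
  double* MA100 = series(SMA(Close,100));  // slow moving average

  Stop = 4*ATR(100);                       // stop loss distance: 4 x Average True Range
  if(crossOver(MA30,MA100))                // fast average crosses above slow average
    enterLong();
  if(crossUnder(MA30,MA100))               // fast average crosses below slow average
    enterShort();
}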
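
To make the crossover logic of this script explicit without any platform functions, here is a minimal sketch in plain C. It is not how Zorro implements its functions: sma and crossover_signal are hypothetical helpers written for this illustration, the price array is assumed to hold the oldest bar at index 0, and the handling of equal averages is my own choice. Position handling and the stop loss are left out.

```c
#include <assert.h>

/* Simple moving average of the 'period' prices ending at bar i.
   prices[0] is the oldest bar; requires i >= period-1. */
double sma(const double *prices, int i, int period)
{
    double sum = 0.;
    int k;
    assert(period > 0 && i >= period - 1);
    for (k = i - period + 1; k <= i; k++)
        sum += prices[k];
    return sum / period;
}

/* Signal at bar i: +1 if the fast average crossed above the slow one
   between bar i-1 and bar i, -1 if it crossed below, 0 otherwise. */
int crossover_signal(const double *prices, int i, int fast, int slow)
{
    double f0, s0, f1, s1;
    /* the slow average must be available at the previous bar, too */
    assert(fast > 0 && fast < slow && i >= slow);
    f0 = sma(prices, i, fast);      /* current bar */
    s0 = sma(prices, i, slow);
    f1 = sma(prices, i - 1, fast);  /* previous bar */
    s1 = sma(prices, i - 1, slow);
    if (f1 <= s1 && f0 > s0) return 1;
    if (f1 >= s1 && f0 < s0) return -1;
    return 0;
}
```

A backtest would call crossover_signal once per bar, for every i from slow up to the last bar, and open a long position on +1 and a short position on -1. Recomputing the averages at every bar is wasteful, but keeps the sketch short.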

More code can be found among the script examples on the Zorro website. You can see that Zorro offers a relatively easy trading implementation. But here comes the drawback of the C language: You cannot drop in external libraries as easily as in Python or R. Using a C/C++ based data analysis or machine learning package sometimes involves a lengthy implementation. Fortunately, Zorro can also call R and Python functions for those purposes.

R is a script interpreter for data analysis and charting. It is not a real language with consistent syntax, but more a conglomerate of operators and data structures that has grown over 20 years. It’s harder to learn than a normal computer language, but offers some unique advantages. I’ll use it in this blog when it comes to complex analysis or machine learning tasks. It’s my tool of choice for financial hacking because:

  • It’s free. (“Software is like sex: it’s better when it’s free.”)
  • R scripts can be very short and effective (once you get used to the syntax).
  • It’s the global standard for data analysis and machine learning.
  • Open architecture – you can add modules for almost anything.
  • Minimalistic – just a console with a language interpreter.
  • Very stable – I found a few bugs in external libraries, but so far never in the main program.
  • Has tons of “packages” for all imaginable mathematical and statistical tasks, and especially for machine learning.
  • Is continuously developed and supported by the global scientific community (about 15 new packages usually come out every day).


This is the SMA crossover in R for a ‘vectorized’ backtest:

require(quantmod)
require(PerformanceAnalytics)

Data <- xts(read.zoo("EURUSD.csv", tz="UTC", format="%Y-%m-%d %H:%M", sep=",", header=TRUE))
Close <- Cl(Data)
MA30 <- SMA(Close,30)
MA100 <- SMA(Close,100)
 
Dir <- ifelse(MA30 > MA100,1,-1) # calculate trade direction
Dir.1 <- c(NA,Dir[-length(Dir)]) # shift by 1 for avoiding peeking bias
Return <- ROC(Close)*Dir.1 
charts.PerformanceSummary(na.omit(Return))

You can see that the vectorized code just consists of function calls. It runs almost as fast as the C equivalent. But it is difficult to read, it cannot be used for live trading, and many parts of a trading logic – even a simple stop loss – cannot be coded for a vectorized test. Thus, as good as R is for interactive data analysis, it is hopeless for writing trading strategies, although some R packages (for instance, quantstrat) even offer rudimentary optimization and test functions. They all require an awkward coding style and do not simulate trading very realistically, and yet are still too slow for serious backtests.
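
The stop loss problem comes from path dependence: whether a trade is still open at a given bar depends on every bar since its entry, which is awkward to express as element-wise operations on whole price vectors. A step-by-step test handles it naturally. The following plain C sketch is only an illustration under simplifying assumptions of mine: long_trade_profit is a hypothetical helper, the trade is a single long position, and the stop is filled at the close of the first bar that breaches it, with no slippage and no intrabar prices.

```c
#include <assert.h>

/* Profit of one long trade entered at prices[entry] and held until the
   last bar prices[n-1], unless the price falls by 'stop' or more first.
   Assumption: a stopped trade exits at the close of the breaching bar. */
double long_trade_profit(const double *prices, int n, int entry, double stop)
{
    int i;
    assert(n > 0 && entry >= 0 && entry < n && stop > 0.);
    for (i = entry + 1; i < n; i++) {
        if (prices[i] <= prices[entry] - stop)
            return prices[i] - prices[entry];  /* stopped out, later bars are ignored */
    }
    return prices[n - 1] - prices[entry];      /* never stopped: exit at the last bar */
}
```

For the price path 100, 98, 95, 110 and an entry at the first bar, a stop distance of 4 ends the trade at 95 with a loss of 5, while a stop distance of 10 keeps it open until 110 for a profit of 10. The same prices give opposite results, depending only on what happened along the way.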

Although R cannot replace a serious backtest and trading platform, Zorro and R complement each other perfectly: Here is an example of a machine learning system built with a deep learning package from R and the training and trading framework from Zorro.

More hacker’s tools

Aside from languages and platforms, you’ll often need auxiliary tools that may be small, simple, cheap, but all the more important since you’re using them all the time. For editing not only scripts, but even short CSV lists I use Notepad++. For interactive working with R I recommend RStudio. Very helpful for strategy development is a file comparison tool: You often have to compare trade logs of different system variants and check which variant opened which trade a little earlier or later, and what consequences that had. For this I use Beyond Compare.

Aside from Zorro and R, there’s also a relatively new system development software that I plan to examine more closely at some time in the future: TSSB, for generating and testing bias-free trading systems with advanced machine learning algorithms. David Aronson and Timothy Masters were involved in its development, so it certainly won’t be as useless as most other “trade system generating” software. However, there’s again a limitation: TSSB cannot trade or export, so you cannot really use the ingenious systems that you developed with it. Maybe I’ll find a solution to combine TSSB with Zorro.

References

TIOBE index of top programming languages

Speed comparison of programming languages


Update (November 2017). The release of new deep learning packages has made TSSB sort of obsolete. For instance, the H2O package natively supports several ways of feature filtering and dimensionality reduction, as well as ensembles, both so far the strength of TSSB. H2O is supported with Zorro’s advise function. Still, the TSSB book by David Aronson is a valuable source of methods, approaches, and tips about machine learning for financial prediction.

Download links to the latest versions of Zorro and R are placed on the side bar. A brief tutorial to both Zorro and R is contained in the Zorro manual; a more comprehensive introduction into working with Zorro can be found in the Black Book.

19 thoughts on “Hacker’s Tools”

  1. Hi Johanne Christian.
    It is always interesting to read about your exploits. It seems, you have a real passion for the science of algo trading and it’s contagious 🙂 Keep going!
    From the picture of your Zorro interface above, one can tell that it’s connected to IB. How did you do it?
    Thanks for the great work and
    good luck!

  2. Yes, the IB plugin is in development and I have already a test version. It currently works for Forex, but not yet for stocks.

  3. Have you again looked at TSSB? I am biting my teeth out on it, but somehow not making enough progress.
    Thanks, Brian

  4. Not yet, but it’s on my todo list. Unfortunately the list is quite long at the moment, so it might take a while.

  5. Well, Zorro is not exactly free. It’s free unless you have more than 7000 US$ in your account.

  6. I am sceptical about this software. Nothing in this world comes free and this Zorro software looks too good to be true. If they want to give a software free then why don’t they open source it and host it in github.. Any way nice article

  7. Neat advice,

    Luckily now, there’s more room to use the software tools with which you are most comfortable. For example, on Python you can run on PyPy or Cython (which speeds scripts up to C/C++ levels). There is also a wide array of packages for statistical analysis, financial modelling and artificial intelligence usage.

    All this comes in handy when you can leverage on your proficiency to write fine trading systems quickly with what you. Which beats learning a new language altogether!

  8. Don’t get me wrong, I like your blog and your articles are very interesting, however, I must say I don’t share the same enthusiasm for Zorro platform. I download it and tried it as well as its website. It’s all weird. It seems pretty fast but buggy. If you want to trade seriously you have to buy some license ( no problem ) but according to my targets working with IB, it doesn’t apply. I don’t know, I don’t trust it to put any logic in there.

  9. Just out of interest: “my targets working with IB, it doesn’t apply” – what targets do you mean and what does not apply?

  10. This is the right site for anyone who wants to understand this topic.

    You realize a whole lot its almost tough to argue
    with you (not that I personally will need to…HaHa).
    You definitely put a brand new spin on a subject which
    has been written about for decades. Great stuff, just excellent!

  11. I would be very interested in knowing how Cython and especially Pypy stack up in your benchmarks above.

  12. I haven’t tested them, but suppose they are somewhere between C# and Javascript in execution speed. They are certainly faster than interpreted Python, but slower than C-like languages due to their additional overhead in data structures.
