Arbitrage – The Financial Hacker

Hacking a HFT system

jcl — Sat, 08 Jul 2017 13:26:30 +0000

Compared with machine learning or signal processing algorithms of conventional algo trading strategies, High Frequency Trading systems can be surprisingly simple. They need not attempt to predict future prices. They know the future prices already. Or rather, they know the prices that lie in the future for other, slower market participants. Recently we got some contracts for simulating HFT systems in order to determine their potential profit and maximum latency. This article is about testing HFT systems the hacker’s way.

The HFT advantage is receiving price quotes earlier and getting orders filled faster than the majority of market participants. Its profit depends on the system’s latency, the delay between price quote and subsequent order execution at the exchange. Latency is the most relevant factor of a HFT system. It can be optimized in two ways: by minimizing the distance to the exchange, and by maximizing the speed of the trading system. The former is more important than the latter.

The location

Ideally, a HFT server is located directly at the exchange. And indeed most exchanges are dear selling server space in their cellars – the closer to the main network hub, the dearer the space. Electrical signals in a shielded wire travel with about 0.7 .. 0.9 times the speed of light (300 km/ms). 1 m closer to the signal source means a whopping 8 nanoseconds round-trip time advantage. How many trade opportunities disappear after 8 ns? I don’t know, but people are willing to pay for any saved nanosecond.

Unfortunately (or fortunately, from a cost perspective), the HFT system that we’ll examine here can not draw advantage from colocating at the exchange. For reasons that we’ll see soon, it must trade and receive quotes from the NYSE and the CME simultaneously. There’s a high speed cable and also a microwave radio link running between both cities. The theoretically ideal location of such a HFT system is Warren, Ohio, exactly halfway between New York City and Chicago. I don’t know whether there’s a high speed node in Warren, but if it is, the distance of 357 miles to both exchanges would be equivalent to about 4 ms round-trip latency.

Warren, Ohio – HFT Mecca of the world (image by Jack Pearce / Wikipedia Commons)

Without doubt, a server in this pleasant town would come at lower cost than a server in the cellar of the New York Stock Exchange. My today’s free tip for getting rich: Buy some garage space in Warren, tap into the high speed New York-Chicago cable or radio link, and rent out server racks!

The software

When you’ve already invested money for the optimal location and connection of the HFT system, you’ll also want an algo trading software that matches the required speed. Commercial trading platforms – even Zorro – are normally not fast enough. And you don’t know what they do behind your back. HFT systems are therefore normally not based on a universal trading platform, but directly coded. Not in R or Python, but in a fast language, usually one of the following:

C or C++. Good combination of high level and high speed. C is easy to read, but very effective and the fastest language of all – almost as fast as machine code.
Pentium Assembler. Write your algorithm in machine instructions. Faster than even C when the algorithm mostly runs in loops that can be manually optimized. There are specialists for optimizing assembler code. Disadvantage: Any programmer has a hard time to understand assembler programs written by another programmer.
CUDA, HLSL, or GPU assembler. Run your algorithm on the shader pipeline of the PC’s video card. This makes sense when it is heavily based on vector and matrix operations.
VHDL. If any software would be too slow and trade success really depends on nanoseconds, the ultimate solution is coding the system in hardware. In VHDL you can define arithmetic units, digital filters, and sequencers on a FPGA (Field Programmable Gate Array) chip with a clock rate up to several 100 MHz. The chip can be directly connected to the network interface.

With exception of VHDL, especially programmers of 3D computer games are normally familar with those high-speed languages. But the standard language of HFT systems is plain C/C++. It is also used in this article.

The trading algorithm

Many HFT systems prey on traders by using fore-running methods. They catch your quote, buy the same asset some microseconds earlier, and sell it to you with a profit. Some exchanges prevent this in the interest of fair play, while other exchanges encourage this in the interest of earning more fees. For this article we won’t use fore-running methods, but simply exploit arbitrage between ES and SPY. We’re assuming that our server is located in Warren, Ohio, and has a high speed connection to Chicago and to New York.

ES is a Chicago traded S&P500 future, exposed to supply and demand. SPY is a New York traded ETF, issued by State Street in Boston and following the S&P500 index (minus State Street’s fees). 1 ES Point is equivalent to 10 SPY cents, so the ES price is about ten times the SPY price. Since both assets are based on the same index, we can expect that their prices are highly correlated. There have been some publications (1) that claim this correlation will break down at low time frames, causing one asset trailing the other. We’ll check if this is true. Any shortlived ES-SPY price difference that exceeds the bid-ask spreads and trading costs constitutes an arbitrage opportunity. Our arbitrage algorithm would work this way:

Determine the SPY-ES difference.
Determine its deviation from the mean.
If the deviation exceeds the ask-bid spreads plus a threshold, open positions in ES and SPY in opposite directions.
If the deviation reverses its sign and exceeds the ask-bid spreads plus a smaller threshold, close the positions.

This is the barebone HFT algorithm in C. If you’ve never seen HFT code before, it might look a bit strange:

#define THRESHOLD  0.4  // Entry/Exit threshold 

// HFT arbitrage algorithm
// returns 0 for closing all positions
// returns 1 for opening ES long, SPY short
// returns 2 for opening ES short, SPY long
// returns -1 otherwise
int tradeHFT(double AskSPY,double BidSPY,double AskES,double BidES)
{
	double SpreadSPY = AskSPY-BidSPY, SpreadES = AskES-BidES;
	double Arbitrage = 0.5*(AskSPY+BidSPY-AskES-BidES);

	static double ArbMean = Arbitrage;	
	ArbMean = 0.999*ArbMean + 0.001*Arbitrage;
	static double Deviation = 0;
	Deviation = 0.75*Deviation + 0.25*(Arbitrage - ArbMean);

	static int Position = 0;	
	if(Position == 0) {
		if(Deviation > SpreadSPY+THRESHOLD)
			return Position = 1;
		if(-Deviation > SpreadES+THRESHOLD)
			return Position = 2;
	} else {
		if(Position == 1 && -Deviation > SpreadES+THRESHOLD/2)
			return Position = 0;
		if(Position == 2 && Deviation > SpreadSPY+THRESHOLD/2)
			return Position = 0;
	}
	return -1;	
}

The tradeHFT function is called from some framework – not shown here – that gets the price quotes and sends the trade orders. Parameters are the current best ask and bid prices of ES and SPY from the top of the order book (we assume here that the SPY price is multiplied with 10 so that both assets are on the same scale). The function returns a code that tells the framework whether to open or close opposite positions or to do nothing. The Arbitrage variable is the mid price difference of SPY and ES. Its mean (ArbMean) is filtered by a slow EMA, and the Deviation from the mean is also slightly filtered by a fast EMA to prevent reactions on outlier quotes. The Position variable constitutes a tiny state machine with the 3 states long, short, and nothing. The entry/exit Threshold is here set to 40 cents, equivalent to a bad SPY spread multiplied with 10. This is the only adjustable parameter of the system. If we wanted to really trade it, we had to optimize the threshold using several months’ ES and SPY data.

This minimalistic system would be quite easy to convert to Pentium assembler or even to a FPGA chip. But it is not really necessary: Even compiled with Zorro’s lite-C compiler, the tradeHFT function executes in just about 750 nanoseconds. A strongly optimizing C compiler, like Microsoft VC++, gets the execution time down to 650 nanoseconds. Since the time span between two ES quotes is normally 10 microseconds or more, the speed of C is largely sufficient.

Our HFT experiment has to answer two questions. First, are those price differences big enough for arbitrage profit? And second, at which maximum latency will the system still work?

The data

For backtesting a HFT system, normal price data that you can freely download from brokers won’t do. You have to buy high resolution order book or BBO data (Best Bid and Offer) that includes the exchange time stamps. Without knowing the exact time when the price quote was received at the exchange, it is not possible to determine the maximum latency.

Some companies are recording all quotes from all exchanges and are selling this data. Any one has its specific data format, so the first challenge is to convert this to lean data files that we then evaluate with our simulation software. We’re using this very simple target format for price quotes:

typedef struct T1    // single tick
{
	double time; // time stamp, OLE DATE format
	float fVal;  // positive = ask price, negative = bid price	
} T1;

The company watching the Chicago Mercantile Exchange delivers its data in a specific CSV format with many additional fields, of which most we don’t need here (f.i. the trade volume or the quote arrival time). Every day’s quotes are stored in one CSV file. This is the Zorro script for pulling the ES December 2016 contract out of it and converting it to a dataset of T1 price quotes:

//////////////////////////////////////////////////////
// Convert price history from Nanotick BBO to .t1
//////////////////////////////////////////////////////

#define STARTDAY 20161004
#define ENDDAY   20161014

string InName = "History\\CME.%08d-%08d.E.BBO-C.310.ES.csv";  // name of a day file
string OutName = "History\\ES_201610.t1";
string Code = "ESZ";	// December contract symbol

string Format = "2,,%Y%m%d,%H:%M:%S,,,s,,,s,i,,";  // Nanotick csv format
void main()
{
	int N,Row,Record,Records;
	for(N = STARTDAY; N <= ENDDAY; N++)
	{
		string FileName = strf(InName,N,N+1);
		if(!file_date(FileName)) continue;
		Records = dataParse(1,Format,FileName);  // read BBO data
		printf("\n%d rows read",Records);
		dataNew(2,Records,2);  // create T1 dataset
		for(Record = 0,Row = 0; Record < Records; Record++)
		{
			if(!strstr(Code,dataStr(1,Record,1))) continue; // select only records with correct symbol
			T1* t1 = dataStr(2,Row,0);  // store record in T1 format
			float Price = 0.01 * dataInt(1,Record,3);  // price in cents
			if(Price < 1000) continue;  // no valid price
			string AskBid = dataStr(1,Record,2);
			if(AskBid[0] == 'B')  // negative price for Bid
				Price = -Price;
			t1->fVal = Price;
			t1->time = dataVar(1,Record,0) + 1./24.;  // add 1 hour Chicago-NY time difference
			Row++;
		}
		printf(", %d stored",Row);
		dataAppend(3,2,0,Row);  // append dataset
		if(!wait(0)) return;
	}
	dataSave(3,OutName);  // store complete dataset
}

We’re converting here 10 days in October 2016 for our backtest. The script first parses the CSV file into an intermediary binary dataset, which is then converted to the T1 target format. Since the time stamps are in Chicago local time, we have to add one hour to convert them to NY time.

The company watching the New York Stock exchange delivers data in a highly compressed specific format named “NxCore Tape”. We’re using the Zorro NxCore plugin for converting this to another T1 list:

//////////////////////////////////////////////////////
// Convert price history from Nanex .nx2 to .t1
//////////////////////////////////////////////////////

#define STARTDAY 20161004
#define ENDDAY 	 20161014
#define BUFFER	 10000

string InName = "History\\%8d.GS.nx2";  // name of a single day tape
string OutName = "History\\SPY_201610.t1";
string Code = "eSPY";

int Row,Rows;

typedef struct QUOTE {
	char	Name[24];
	var	Time,Price,Size;
} QUOTE;

int callback(QUOTE *Quote)
{
	if(!strstr(Quote->Name,Code)) return 1;
	T1* t1 = dataStr(1,Row,0);  // store record in T1 format
	t1->time = Quote->Time;
	t1->fVal = Quote->Price;
	Row++; Rows++;
	if(Row >= BUFFER)	{   // dataset full?
		Row = 0;
		dataAppend(2,1);    // append to dataset 2
	}
	return 1;
}


void main()
{
	dataNew(1,BUFFER,2); // create a small dataset
	login(1);            // open the NxCore plugin
	
	int N;
	for(N = STARTDAY; N <= ENDDAY; N++) {
		string FileName = strf(InName,N);
		if(!file_date(FileName)) continue;
		printf("\n%s..",FileName);
		Row = Rows = 0;  // initialize global variables
		brokerCommand(SET_HISTORY,FileName); // parse the tape
		dataAppend(2,1,0,Row);  // append the rest to dataset 2
		printf("\n%d rows stored",Rows);
		if(!wait(0)) return;  // abort when [Stop] was hit
	}
	dataSave(2,OutName); // store complete dataset
}

The callback function is called by any quote in the tape file. We don’t need most of the data, so we filter out only the SPY quotes (“eSPY”).

Verifying the inefficiency

With data from both sources, we can now compare the ES and SPY prices in high resolution. Here’s a typical 10 seconds sample from the price curves:

SPY (black) vs. ES (red), October 5, 2017, 10:01:25 – 10:01.35

The resolution is 1 ms. ES is plotted in $ units, SPY in 10 cents units, with a small offset so that the curves lie over each other. The prices shown are ask prices. We can see that ES moves in steps of 25 cents, SPY in steps of 1 cent. The prices are still well correlated even on a millisecond scale. ES appears to trail a tiny bit.

We can also see an arbitrage opportunity at the steep price step in the center at about 10:01:30. ES reacted a bit slower, but stronger on some event, probably a moderate price jump of one of the stocks of the S&P 500. This event also triggered a fast sequence of oscillating SPY quotes, most likely from other HFT systems, until the situation calmed down again a few 100 ms later (the scale is not linear since time periods with no quotes are skipped). For several milliseconds the ES-SPY difference exceeded the ask-bid spread of both assets (usually 25 cents for ES and 1..4 cents for SPY). We would here ideally sell ES and buy SPY immediately after the ES price step. So we have verified that the theoretized inefficiency does really exist, at least in this sample.

The script for displaying high resolution charts:

#define ES_HISTORY	"ES_201610.t1"
#define SPY_HISTORY	"SPY_201610.t1"
#define TIMEFORMAT	"%Y%m%d %H:%M:%S"
#define FACTOR		10
#define OFFSET		3.575

void main()
{
	var StartTime = wdatef(TIMEFORMAT,"20161005 10:01:25"),
		EndTime = wdatef(TIMEFORMAT,"20161005 10:01:35");
	MaxBars = 10000;
	BarPeriod = 0.001/60.;	// 1 ms plot resolution
	Outlier = 1.002;  // filter out 0.2% outliers

	assetList("HFT.csv");
	dataLoad(1,ES_HISTORY,2);
	dataLoad(2,SPY_HISTORY,2);
	int RowES=0, RowSPY=0;
	
	while(Bar < MaxBars)
	{
		var TimeES = dataVar(1,RowES,0), 
			PriceES = dataVar(1,RowES,1), 
			TimeSPY = dataVar(2,RowSPY,0), 
			PriceSPY = dataVar(2,RowSPY,1);

		if(TimeES < TimeSPY) RowES++;
		else RowSPY++;

		if(min(TimeES,TimeSPY) < StartTime) continue;
		if(max(TimeES,TimeSPY) > EndTime) break;

		if(TimeES < TimeSPY) {
			asset("ES");
			priceQuote(TimeES,PriceES);
		} else {
			asset("SPY");
			priceQuote(TimeSPY,PriceSPY);
		}
		
		asset("ES");
		if(AssetBar > 0) plot("ES",AskPrice+OFFSET,LINE,RED);
		asset("SPY");
		if(AssetBar > 0) plot("SPY",FACTOR*AskPrice,LINE,BLACK);
	}
}

The script first reads the two historical data files that we’ve created above, and then parses them row by row. For keeping the ES and SPY quotes in sync, we always read the quote with the smaller timestamp from the datasets (the quotes are stored in ascending timestamp order). The priceQuote function checks the prices for outliers, stores the ask price in the AskPrice variable and the ask-bid difference in Spread, and increases the Bar count for plotting the price curves. A bar of the curve is equivalent to 1 ms. The AssetBar variable is the last bar with a price quote of that asset, and is used here to prevent plotting before the first quote arrived.

Testing the system

For backtesting our HFT system, we only need to modify the script above a bit, and call the tradeHFT function in the loop:

#define LATENCY		4.0	// milliseconds

function main()
{
	var StartTime = wdatef(TIMEFORMAT,"20161005 09:30:00"),
		EndTime = wdatef(TIMEFORMAT,"20161005 15:30:00");
	MaxBars = 200000;
	BarPeriod = 0.1/60.;	// 100 ms bars 
	Outlier = 1.002;

	assetList("HFT.csv");
	dataLoad(1,ES_HISTORY,2);
	dataLoad(2,SPY_HISTORY,2);
	int RowES=0, RowSPY=0;
	
	EntryDelay = LATENCY/1000.;
	Hedge = 2;
	Fill = 8; // HFT fill mode;
	Slippage = 0;
	Lots = 100;
		
	while(Bar < MaxBars)
	{
		var TimeES = dataVar(1,RowES,0), 
			PriceES = dataVar(1,RowES,1),
			TimeSPY = dataVar(2,RowSPY,0),
			PriceSPY = dataVar(2,RowSPY,1);

		if(TimeES < TimeSPY) RowES++;
		else RowSPY++;

		if(min(TimeES,TimeSPY) < StartTime) continue;
		if(max(TimeES,TimeSPY) > EndTime) break;

		if(TimeES < TimeSPY) {
			asset("ES");
			priceQuote(TimeES,PriceES);
		} else {
			asset("SPY");
			priceQuote(TimeSPY,FACTOR*PriceSPY);
		}
		
		asset("ES");
		if(!AssetBar) continue;
		var AskES = AskPrice, BidES = AskPrice-Spread;
		asset("SPY");
		if(!AssetBar) continue;
		var AskSPY = AskPrice, BidSPY = AskPrice-Spread;

		int Order = tradeHFT(AskSPY,BidSPY,AskES,BidES);	
		switch(Order) {
			case 1: 
			asset("ES"); enterLong();
			asset("SPY"); enterShort();
			break;
		
			case 2: 
			asset("ES"); enterShort();
			asset("SPY"); enterLong();
			break;
		
			case 0:
			asset("ES"); exitLong(); exitShort();
			asset("SPY"); exitLong(); exitShort();
			break;
		}
	}
	printf("\nProfit %.2f at NY Time %s",
		Equity,strdate(TIMEFORMAT,dataVar(1,RowES,0)));
}

This script runs a HFT backtest of one trading day, from 9:30 until 15:30 New York time. We can see that it just calls the HFT function with the ES and SPY prices, then executes the returned code in the switch-case statement. It opens 100 units of each asset (equivalent to 2 ES and 1000 SPY contracts). The round-trip latency is set up with the EntryDelay variable. In HFT mode (Fill = 8), a trade is filled at the most recent price quote after the given delay. This ensures a realistic latency simulation when price quotes are entered with their exchange time stamps.

Here’s the HFT profit at the end of the day with different round-trip latency values:

LATENCY	0.5 ms	4.0 ms	6.0 ms	10 ms
Profit / day	+ $793	+ $273	+ $205	– $15

The ES-SPY HFT arbitrage system makes about $800 each day with an unrealistic small latency of 500 microseconds. Unfortunately, due to the 700 miles between the NYSE and the CME, you’d need a time machine for that (or some faster-than-light method of quantum teleportation). A HFT server in Warren, Ohio, at 4 ms latency would generate about $300 per day. A server slightly off the direct NY-Chicago line can still grind out $200 daily. But the system deteriorates quickly when located further away, as with a server in Nashville, Tennessee, at around 10 ms latency. This is a strong hint that some high-speed systems in the proximity of both exchanges are already busy with exploiting ES-SPY arbitrage.

Still, $300 per day result in a decent $75,000 annual income. However, what needed you to invest for that, aside from hardware and software? With SPY at $250, the 100 units translate to 100*$2500 + 100*10*$250 = half a million dollars trade volume. So you would get only 15% annual return on your investment. But you can improve this by adding more pairs of NY ETFs and their equivalent CME futures to the arbitrage strategy. And you can improve it further when you find a broker or similar service that can receive your orders directly at the exchanges. Due to the ETF/future hedging the position is almost without risk. So you could probably negotiate a large leverage. And also a flat monthly fee, since broker commissions were not included in the test.

Conclusion

When systems react fast enough, profits can be achieved with very primitive methods, such as arbitrage between highly correlated assets.
Location has a large impact on HFT profits.
ES-SPY arbitrage cannot be traded by everyone from everywhere. You’re competing with people that are doing this already. Possibly in Warren, Ohio.

I’ve added the scripts and the asset list to the 2017 script archive in the “HFT” and “History” folders. Unfortunately I could not add the ES / SPY price history files, since I do not own the data. For reproducing the results, get BBO history from Nanex or Nanotick – their data can be read with the scripts above. You’ll also need Zorro S version 1.60 or above, which supports HFT fill mode.

References

(1) When Correlations Break Down (Jared Bernstein 2014)

(2) Financial Programming Greatly Accelerated (Cousin/Weston, Automated Trader 42/2017)

Addendum. I have been asked how the profit would be affected when the server is located in New York, with 0.25 ms signal delay to the NYSE and 4.8 ms signal delay to the CME. For simulating this, modify the script:

var TimeES = dataVar(1,RowES,0)+4.8/(1000*60*60*24),
  TimeSPY = dataVar(2,RowSPY,0)+0.25/(1000*60*60*24),
...
switch(Order) {
	case 1: 
		EntryDelay = 4.8/1000;
		asset("ES"); enterLong();
		EntryDelay = 0.25/1000;
		asset("SPY"); enterShort();
		break;
		
	case 2: 
		EntryDelay = 4.8/1000;
		asset("ES"); enterShort();
		EntryDelay = 0.25/1000;
		asset("SPY"); enterLong();
		break;
		
	case 0:
		EntryDelay = 4.8/1000;
		asset("ES"); exitLong(); exitShort();
		EntryDelay = 0.25/1000;
		asset("SPY"); exitLong(); exitShort();
		break;
}

The first line simulates the price quote arrivals with 4.8 and 0.25 ms delay, the other lines establish the different order delays for SPY and ES. Alternatively, you could artificially delay the incoming SPY quotes by another 4.55 ms, so that their time stamps again match the ES quotes. In both cases the system returns about $240 per day, almost as much as in Warren. However a similar system located in Aurora, close to Chicago (swap the delays in the script), would produce zero profit. The asymmetry is caused by the relatively long constant periods of ES, making the SPY latency more relevant for the money at the end of the day.

Build Better Strategies! Part 2: Model-Based Systems

jcl — Fri, 25 Dec 2015 12:34:08 +0000

Trading systems come in two flavors: model-based and data-mining. This article deals with model based strategies. Even when the basic algorithms are not complex, properly developing them has its difficulties and pitfalls (otherwise anyone would be doing it). A significant market inefficiency gives a system only a relatively small edge. Any little mistake can turn a winning strategy into a losing one. And you will not necessarily notice this in the backtest.

Developing a model-based strategy begins with the market inefficiency that you want to exploit. The inefficiency produces a price anomaly or price pattern that you can describe with a qualitative or quantitative model. Such a model predicts the current price y_t from the previous price y_t-1 plus some function f of a limited number of previous prices plus some noise term ε:

[pmath size=16]y_t ~=~ y_{t-1} + f(y_{t-1},~…, ~y_{t-n}) + epsilon[/pmath]

The time distance between the prices y_t is the time frame of the model; the number n of prices used in the function f is the lookback period of the model. The higher the predictive f term in relation to the nonpredictive ε term, the better is the strategy. Some traders claim that their favorite method does not predict, but ‘reacts on the market’ or achieves a positive return by some other means. On a certain trader forum you can even encounter a math professor who re-invented the grid trading system, and praised it as non-predictive and even able to trade a random walk curve. But systems that do not predict y_t in some way must rely on luck; they only can redistribute risk, for instance exchange a high risk of a small loss for a low risk of a high loss. The profit expectancy stays negative. As far as I know, the professor is still trying to sell his grid trader, still advertising it as non-predictive, and still regularly blowing his demo account with it.

Trading by throwing a coin loses the transaction costs. But trading by applying the wrong model – for instance, trend following to a mean reverting price series – can cause much higher losses. The average trader indeed loses more than by random trading (about 13 pips per trade according to FXCM statistics). So it’s not sufficient to have a model; you must also prove that it is valid for the market you trade, at the time you trade, and with the used time frame and lookback period.

Not all price anomalies can be exploited. Limiting stock prices to 1/16 fractions of a dollar is clearly an inefficiency, but it’s probably difficult to use it for prediction or make money from it. The working model-based strategies that I know, either from theory or because we’ve been contracted to code some of them, can be classified in several categories. The most frequent are:

1. Trend

Momentum in the price curve is probably the most significant and most exploited anomaly. No need to elaborate here, as trend following was the topic of a whole article series on this blog. There are many methods of trend following, the classic being a moving average crossover. This ‘hello world’ of strategies (here the scripts in R and in C) routinely fails, as it does not distinguish between real momentum and random peaks or valleys in the price curve.

The problem: momentum does not exist in all markets all the time. Any asset can have long non-trending periods. And contrary to popular belief this is not necessarily a ‘sidewards market’. A random walk curve can go up and down and still has zero momentum. Therefore, some good filter that detects the real market regime is essential for trend following systems. Here’s a minimal Zorro strategy that uses a lowpass filter for detecting trend reversal, and the MMI indicator for determining when we’re entering trend regime:

function run()
{
  vars Price = series(price());
  vars Trend = series(LowPass(Price,500));
	
  vars MMI_Raw = series(MMI(Price,300));
  vars MMI_Smooth = series(LowPass(MMI_Raw,500));
	
  if(falling(MMI_Smooth)) {
    if(valley(Trend))
      reverseLong(1);
    else if(peak(Trend))
      reverseShort(1);
  }
}

The profit curve of this strategy:

Momentum strategy profit curve

(For the sake of simplicity all strategy snippets on this page are barebone systems with no exit mechanism other than reversal, and no stops, trailing, parameter training, money management, or other gimmicks. Of course the backtests mean in no way that those are profitable systems. The P&L curves are all from EUR/USD, an asset good for demonstrations since it seems to contain a little bit of every possible inefficiency).

2. Mean reversion

A mean reverting market believes in a ‘real value’ or ‘fair price’ of an asset. Traders buy when the actual price is cheaper than it ought to be in their opinion, and sell when it is more expensive. This causes the price curve to revert back to the mean more often than in a random walk. Random data are mean reverting 75% of the time (proof here), so anything above 75% is caused by a market inefficiency. A model:

[pmath size=16]y_t ~=~ y_{t-1} ~-~ 1/{1+lambda}(y_{t-1}- hat{y}) ~+~ epsilon[/pmath]

[pmath size=16]y_t[/pmath] = price at bar t
[pmath size=16]hat{y}[/pmath] = fair price
[pmath size=16]lambda[/pmath] = half-life factor
[pmath size=16]epsilon[/pmath] = some random noise term

The higher the half-life factor, the weaker is the mean reversion. The half-life of mean reversion in price series ist normally in the range of 50-200 bars. You can calculate λ by linear regression between y_t-1 and (y_t-1-y_t). The price series need not be stationary for experiencing mean reversion, since the fair price is allowed to drift. It just must drift less as in a random walk. Mean reversion is usually exploited by removing the trend from the price curve and normalizing the result. This produces an oscillating signal that can trigger trades when it approaches a top or bottom. Here’s the script of a simple mean reversion system:

function run()
{
  vars Price = series(price());
  vars Filtered = series(HighPass(Price,30));
  vars Signal = series(FisherN(Filtered,500));
  var Threshold = 1.0;

  if(Hurst(Price,500) < 0.5) { // do we have mean reversion?
    if(crossUnder(Signal,-Threshold))
      reverseLong(1); 
    else if(crossOver(Signal,Threshold))
      reverseShort(1);
  }
}

The highpass filter dampens all cycles above 30 bars and thus removes the trend from the price curve. The result is normalized by the Fisher transformation which produces a Gaussian distribution of the data. This allows us to determine fixed thresholds at 1 and -1 for separating the tails from the resulting bell curve. If the price enters a tail in any direction, a trade is triggered in anticipation that it will soon return into the bell’s belly. For detecting mean reverting regime, the script uses the Hurst Exponent. The exponent is 0.5 for a random walk. Above 0.5 begins momentum regime and below 0.5 mean reversion regime.

Mean reversion profit curve

3. Statistical Arbitrage

Strategies can exploit the similarity between two or more assets. This allows to hedge the first asset by a reverse position in the second asset, and this way derive profit from mean reversion of their price difference:

[pmath size=16]y ~=~ h_1 y_1 – h_2 y_2[/pmath]

where y₁ and y₂ are the prices of the two assets and the multiplication factors h₁ and h₂ their hedge ratios. The hedge ratios are calculated in a way that the mean of the difference y is zero or a constant value. The simplest method for calculating the hedge ratios is linear regression between y₁ and y₂. A mean reversion strategy as above can then be applied to y.

The assets need not be of the same type; a typical arbitrage system can be based on the price difference between an index ETF and its major stock. When y is not stationary – meaning that its mean tends to wander off slowly – the hedge ratios must be adapted in real time for compensating. Here is a proposal using a Kalman Filter by a fellow blogger.

The simple arbitrage system from the R tutorial:

require(quantmod)

symbols <- c("AAPL", "QQQ")
getSymbols(symbols)

#define training set
startT  <- "2007-01-01"
endT    <- "2009-01-01"
rangeT  <- paste(startT,"::",endT,sep ="")
tAAPL   <- AAPL[,6][rangeT]
tQQQ   <- QQQ[,6][rangeT]
 
#compute price differences on in-sample data
pdtAAPL <- diff(tAAPL)[-1]
pdtQQQ <- diff(tQQQ)[-1]
 
#build the model
model  <- lm(pdtAAPL ~ pdtQQQ - 1)
 
#extract the hedge ratio (h1 is assumed 1)
h2 <- as.numeric(model$coefficients[1])

#spread price (in-sample)
spreadT <- tAAPL - h2 * tQQQ
 
#compute statistics of the spread
meanT    <- as.numeric(mean(spreadT,na.rm=TRUE))
sdT      <- as.numeric(sd(spreadT,na.rm=TRUE))
upperThr <- meanT + 1 * sdT
lowerThr <- meanT - 1 * sdT
 
#run in-sample test
spreadL  <- length(spreadT)
pricesB  <- c(rep(NA,spreadL))
pricesS  <- c(rep(NA,spreadL))
sp       <- as.numeric(spreadT)
tradeQty <- 100
totalP   <- 0

for(i in 1:spreadL) {
     spTemp <- sp[i]
     if(spTemp < lowerThr) {
        if(totalP <= 0){
           totalP     <- totalP + tradeQty
           pricesB[i] <- spTemp
        }
     } else if(spTemp > upperThr) {
       if(totalP >= 0){
          totalP <- totalP - tradeQty
          pricesS[i] <- spTemp
       }
    }
}

4. Price constraints

A price constraint is an artificial force that causes a constant price drift or establishes a price range, floor, or ceiling. The most famous example was the EUR/CHF price cap mentioned in the first part of this series. But even after removal of the cap, the EUR/CHF price has still a constraint, this time not enforced by the national bank, but by the current strong asymmetry in EUR and CHF buying power. An extreme example of a ranging price is the EUR/DKK pair (see below). All such constraints can be used in strategies to the trader’s advantage.

EUR/DKK price range 2006-2016

5. Cycles

Non-seasonal cycles are caused by feedback from the price curve. When traders believe in a ‘fair price’ of an asset, they often sell or buy a position when the price reaches a certain distance from that value, in hope of a reversal. Or they close winning positions when the favorite price movement begins to decelerate. Such effects can synchronize entries and exits among a large number of traders, and cause the price curve to oscillate with a period that is stable over several cycles. Often many such cycles are superposed on the curve, like this:

[pmath size=16]y_t ~=~ hat{y} ~+~ sum{i}{}{a_i sin(2 pi t/C_i+D_i)} ~+~ epsilon[/pmath]

When you know the period C_i and the phase D_i of the dominant cycle, you can calculate the optimal trade entry and exit points as long as the cycle persists. Cycles in the price curve can be detected with spectral analysis functions – for instance, fast Fourier transformation (FFT) or simply a bank of narrow bandpass filters. Here is the frequency spectrum of the EUR/USD in October 2015:

EUR/USD spectrum, cycle length in bars

Exploiting cycles is a little more tricky than trend following or mean reversion. You need not only the cycle length of the dominant cycle of the spectrum, but also its phase (for triggering trades at the right moment) and its amplitude (for determining if there is a cycle worth trading at all). This is a barebone script:

function run()
{
  vars Price = series(price());
  var Phase = DominantPhase(Price,10);
  vars Signal = series(sin(Phase+PI/4));
  vars Dominant = series(BandPass(Price,rDominantPeriod,1));
  ExitTime = 10*rDominantPeriod;
  var Threshold = 1*PIP;
	
  if(Amplitude(Dominant,100) > Threshold) {
    if(valley(Signal))
      reverseLong(1); 
    else if(peak(Signal))
      reverseShort(1);
  }
}

The DominantPhase function determines both the phase and the cycle length of the dominant peak in the spectrum; the latter is stored in the rDominantPeriod variable. The phase is converted to a sine curve that is shifted ahead by π/4. With this trick we’ll get a sine curve that runs ahead of the price curve. Thus we do real price prediction here, only question is if the price will follow our prediction. This is determined by applying a bandpass filter centered at the dominant cycle to the price curve, and measuring its amplitude (the a_i in the formula). If the amplitude is above a threshold, we conclude that we have a strong enough cycle. The script then enters long on a valley of the run-ahead sine curve, and short on a peak. Since cycles are shortlived, the duration of a trade is limited by ExitTime to a maximum of 10 cycles.

We can see from the P&L curve that there were long periods in 2012 and 2014 with no strong cycles in the EUR/USD price curve.

6. Clusters

The same effect that causes prices to oscillate can also let them cluster at certain levels. Extreme clustering can even produce “supply” and “demand” lines (also known as “support and resistance“), the favorite subjects in trading seminars. Expert seminar lecturers can draw support and resistance lines on any chart, no matter if it’s pork belly prices or last year’s baseball scores. However the mere existence of those lines remains debatable: There are few strategies that really identify and exploit them, and even less that really produce profits. Still, clusters in price curves are real and can be easily identified in a histogram similar to the cycles spectogram.

7. Curve patterns

They arise from repetitive behavior of traders. Traders not only produce, but also believe in many curve patterns; most – such as the famous ‘head and shoulders’ pattern that is said to predict trend reversal – are myths (at least I have not found any statistical evidence of it, and heard of no other any research that ever confirmed the existence of predictive heads and shoulders in price curves). But some patterns, for instance “cups” or “half-cups”, really exist and can indeed precede an upwards or downwards movement. Curve patterns – not to be confused with candle patterns – can be exploited by pattern-detecting methods such as the Fréchet algorithm.

A special variant of a curve pattern is the Breakout – a sudden momentum after a long sidewards movement. Is can be caused, for instance, by trader’s tendency to place their stop losses as a short distance below or above the current plateau. Triggering the first stops then accelerates the price movement until more and more stops are triggered. Such an effect can be exploited by a system that detects a sidewards period and then lies in wait for the first move in any direction.

8. Seasonality

“Season” does not necessarily mean a season of a year. Supply and demand can also follow monthly, weekly, or daily patterns that can be detected and exploited by strategies. For instance, the S&P500 index is said to often move upwards in the first days of a month, or to show an upwards trend in the early morning hours before the main trading session of the day. Since seasonal effects are easy to exploit, they are often shortlived, weak, and therefore hard to detect by just eyeballing price curves. But they can be found by plotting a day, week, or month profile of average price curve differences.

9. Gaps

When market participants contemplate whether to enter or close a position, they seem to come to rather similar conclusions when they have time to think it over at night or during the weekend. This can cause the price to start at a different level when the market opens again. Overnight or weekend price gaps are often more predictable than price changes during trading hours. And of course they can be exploited in a strategy. On the Zorro forum was recently a discussion about the “One Night Stand System“, a simple currency weekend-gap trader with mysterious profits.

10. Autoregression and heteroskedasticity

The latter is a fancy word for: “Prices jitter a lot and the jittering varies over time”. The ARIMA and GARCH models are the first models that you encounter in financial math. They assume that future returns or future volatility can be determined with a linear combination of past returns or past volatility. Those models are often considered purely theoretical and of no practical use. Not true: You can use them for predicting tomorrow’s price just as any other model. You can examine a correlogram – a statistics of the correlation of the current return with the returns of the previous bars – for finding out if an ARIMA model fits to a certain price series. Here are two excellent articles by fellow bloggers for using those models in trading strategies: ARIMA+GARCH Trading Strategy on the S&P500 and Are ARIMA/GARCH Predictions Profitable?

11. Price shocks

Price shocks often happen on Monday or Friday morning when companies or organizations publish good or bad news that affect the market. Even without knowing the news, a strategy can detect the first price reactions and quickly jump onto the bandwagon. This is especially easy when a large shock is shaking the markets. Here’s a simple Forex portfolio strategy that evaluates the relative strengths of currencies for detecting price shocks:

function run()
{
  BarPeriod = 60;
  ccyReset();
  string Name;
  while(Name = (loop(Assets)))
  {
    if(assetType(Name) != FOREX) 
      continue; // Currency pairs only
    asset(Name);
    vars Prices = series(priceClose());
    ccySet(ROC(Prices,1)); // store price change as strength
  }
  
// get currency pairs with highest and lowest strength difference
  string Best = ccyMax(), Worst = ccyMin();
  var Threshold = 1.0; // The shock level

  static char OldBest[8], OldWorst[8];	// static for keeping contents between runs
  if(*OldBest && !strstr(Best,OldBest)) { // new strongest asset?
    asset(OldBest);
    exitLong();
    if(ccyStrength(Best) > Threshold) {
      asset(Best);
      enterLong();
    }
  } 
  if(*OldWorst && !strstr(Worst,OldWorst)) { // new weakest asset?
    asset(OldWorst);
    exitShort();
    if(ccyStrength(Worst) < -Threshold) {
      asset(Worst);
      enterShort();
    }
  }

// store previous strongest and weakest asset names  
  strcpy(OldBest,Best);
  strcpy(OldWorst,Worst);
}

The equity curve of the currency strength system (you’ll need Zorro 1.48 or above):

Price shock exploiting system

The blue equity curve above reflects profits from small and large jumps of currency prices. You can clearly identify the Brexit and the CHF price cap shock. Of course such strategies would work even better if the news could be early detected and interpreted in some way. Some data services provide news events with a binary valuation, like “good” or “bad”. Especially of interest are earnings reports, as provided by data services such as Zacks or Xignite. Depending on which surprises the earnings report contains, stock prices and implied volatilities can rise or drop sharply at the report day, and generate quick profits.

For learning what can happen when news are used in more creative ways, I recommend the excellent The Fear IndexFear Index by Robert Harris – a mandatory book in any financial hacker’s library.

This was the second part of the Build Better Strategies series. The third part will deal with the process to develop a model-based strategy, from inital research up to building the user interface. In case someone wants to experiment with the code snippets posted here, I’ve added them to the 2015 scripts repository. They are no real strategies, though. The missing elements – parameter optimization, exit algorithms, money management etc. – will be the topic of the next part of the series.

⇒ Build Better Strategies – Part 3