Simons's Money-Making Secret: Applying the Hidden Markov Model (HMM) to Market Timing

Summary:

Simons is revered in quantitative circles as the god of quant, and his Medallion Fund has created numerous legends. Among the firm's early members was a scientist who co-invented the Baum-Welch algorithm, which is widely used in speech recognition and other fields. Hidden Markov models (HMMs) have been applied successfully in engineering, with results of both scientific significance and practical value.

This article applies the Medallion Fund's reputed weapon, the hidden Markov model, to China's stock market, using pattern recognition on stock data sequences to predict the market trend.

 

I. Introduction: Let's talk about Medallion

Simons is revered in quantitative circles as the god of quant. During the 2008 global financial crisis, when most hedge funds lost money, his fund gained as much as 80%.

Renaissance Technologies, which Simons founded, employs a group of physicists and mathematicians. What money-making weapon did these people come up with together? Outsiders have long speculated and opinions differ; the hidden Markov model is one of the candidates put forward, and not without reason.

This article applies the hidden Markov model to China's stock market, predicting the market trend through pattern recognition on stock data sequences.

 

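II. HMM examples and principles

First, a brief review of the Markov chain. A Markov chain is a discrete stochastic process that has the Markov property: the next state depends only on the current state, not on the earlier history. The Markov property is expressed mathematically as follows:

P(X_{n+1} = x | X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = P(X_{n+1} = x | X_n = x_n)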

A classic Markov example:

Predict future weather from the current weather. One approach is to assume that each day's weather depends only on the state of the day before. The state transition diagram of a Markov model for weather prediction is shown below:

The weather prediction model assumes the following state transition matrix:

The matrix says that if yesterday was cloudy, then today there is a 25% probability of sun, a 12.5% probability of clouds, and a 62.5% probability of rain. Note that each row of the matrix sums to 1.

To initialize such a system, we need an initial probability vector:

This vector assigns probability 1 to sunny, meaning the first day is sunny. We have now defined the three components of the first-order Markov process above:

States: sunny, cloudy, and rainy.

Initial vector: the probability of each system state at time 0.

State transition matrix: the probability of each weather transition. Any system that can be described this way is a Markov process.

However, in some cases a Markov process is not enough to describe the patterns we want to find. An HMM, built on an observation sequence and hidden variables, has certain advantages in pattern recognition.
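As a minimal sketch of how these three components work together, the Python snippet below advances the initial vector through a transition matrix one day at a time. Only the cloudy row (25%, 12.5%, 62.5%) comes from the text; the sunny and rainy rows are assumed values chosen purely for illustration, and the function names are hypothetical.

```python
# States in a fixed order; index i corresponds to STATES[i].
STATES = ["sunny", "cloudy", "rainy"]

# Row = yesterday's weather, column = today's weather.
# Only the "cloudy" row is taken from the text; the other rows are assumed.
TRANSITION = [
    [0.5, 0.375, 0.125],    # sunny  -> sunny/cloudy/rainy (assumed)
    [0.25, 0.125, 0.625],   # cloudy -> sunny/cloudy/rainy (from the text)
    [0.25, 0.375, 0.375],   # rainy  -> sunny/cloudy/rainy (assumed)
]

# Initial vector: the first day is sunny with probability 1.
INITIAL = [1.0, 0.0, 0.0]


def step(dist, transition):
    """Advance a probability distribution over states by one day."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def forecast(initial, transition, days):
    """Return the state distribution after `days` transitions."""
    dist = list(initial)
    for _ in range(days):
        dist = step(dist, transition)
    return dist


# Print the weather distribution for the first few days.
for day in range(4):
    dist = forecast(INITIAL, TRANSITION, day)
    print(day, dict(zip(STATES, (round(p, 4) for p in dist))))
```

Because every row sums to 1, each forecast distribution also sums to 1, which is a quick sanity check on any transition matrix.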

A classic HMM (hidden Markov model) example:

Suppose there are three different dice. The first die has six faces (call it D6), and each face appears with probability 1/6. The second die has four faces (D4), and each face appears with probability 1/4. The third die has eight faces (D8), and each face appears with probability 1/8.

Now roll the dice 10 times. Suppose we get the string of numbers 1 6 3 5 2 7 3 5 2 4. This string is called the observation sequence, or visible state chain. In a hidden Markov model, however, we have not only this visible state chain but also a hidden state chain. In this example, the hidden state chain is the sequence of dice that were used. For example, one possible hidden state chain is: D4 D6 D8 D6 D4 D8 D6 D6 D6 D4.

In general, the Markov chain in an HMM refers to the hidden state chain, because the transition probabilities are defined between the hidden states (the dice).

In our example, the state after D6 is D4, D6, or D8, each with probability 1/3. The state after D4 or D8 is likewise D4, D6, or D8 with probability 1/3 each. This setting is chosen to keep the explanation simple; in fact we are free to set the transition probabilities. For example, we could specify that D4 cannot follow D6, that D6 follows D6 with probability 0.9, and that D8 follows D6 with probability 0.1. That would be a new HMM.

There are no transition probabilities between visible states, but there is a probability between each hidden state and each visible state, called the output probability. In our example, the six-sided die (D6) outputs each number with probability 1/6 (assuming the die has not been tampered with).

In the cases above, the observable state sequence and the hidden state sequence are probabilistically related. We can therefore model this type of process as a hidden Markov process together with a set of observable states that are probabilistically related to it. This is the hidden Markov model, abbreviated HMM.
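The dice example can be sketched as a small generator that produces a hidden state chain and its visible chain side by side. This is an illustrative sketch only: the uniform choice of the first die is an assumption (the text specifies only the 1/3 transitions), and `sample_chain` and `emission_prob` are hypothetical names.

```python
import random

# Hidden states and the number of faces on each die.
DICE = {"D4": 4, "D6": 6, "D8": 8}
STATES = list(DICE)


def emission_prob(die, face):
    """Output probability of `face` given the hidden state `die` (fair dice)."""
    return 1.0 / DICE[die] if 1 <= face <= DICE[die] else 0.0


def sample_chain(length, rng):
    """Draw a hidden state chain and the visible chain it produces.

    Each die is chosen with probability 1/3 regardless of the previous die,
    matching the uniform transition probabilities in the text. The first die
    is also chosen uniformly, which is an assumption.
    """
    hidden, visible = [], []
    for _ in range(length):
        die = rng.choice(STATES)            # transition: 1/3 for each die
        face = rng.randint(1, DICE[die])    # output: 1/faces for each number
        hidden.append(die)
        visible.append(face)
    return hidden, visible


rng = random.Random(0)  # fixed seed so the run is repeatable
hidden, visible = sample_chain(10, rng)
print("hidden :", " ".join(hidden))
print("visible:", " ".join(map(str, visible)))
```

An observer sees only the second line of output; recovering the first line from it is what makes the model "hidden".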

HMM algorithms and the three questions:

In the stock market we often face the following problem: we want to use limited observable information (price, volume, volatility) to infer the unobservable drivers behind stock prices, and even to predict the rise and fall of prices. This kind of prediction modeling has a good deal in common with HMMs.

Building an HMM centers on solving three kinds of problems:

Question 1 (decoding): we know how many dice there are (the number of hidden states) and what each die is (the transition probabilities), and we have the results of the rolls (the visible chain). We want to know which die produced each roll (the hidden state chain).

Question 2 (evaluation): we know how many dice there are (the number of hidden states) and what each die is (the transition probabilities), and we have the results of the rolls (the visible chain). We want to know the probability of rolling this result.

Question 3 (learning): we know how many dice there are (the number of hidden states) but not what each die is (the transition probabilities). Having observed the results of many rolls (the visible chain), we want to infer what each die is (the transition probabilities).
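To make Question 1 and Question 2 concrete, here is a minimal sketch on the dice example using the two standard algorithms for them: the Viterbi algorithm for the most probable hidden chain and the forward algorithm for the probability of the visible chain. The uniform starting distribution is an assumption, and the function names are hypothetical. No scaling is applied, so this form is only suitable for short sequences.

```python
FACES = {"D4": 4, "D6": 6, "D8": 8}
STATES = list(FACES)
START = {s: 1 / 3 for s in STATES}                        # assumed uniform start
TRANS = {s: {t: 1 / 3 for t in STATES} for s in STATES}   # 1/3 each, as in the text


def emit(state, face):
    """Output probability of a face for a fair die."""
    return 1 / FACES[state] if 1 <= face <= FACES[state] else 0.0


def forward(obs):
    """Question 2: probability of the visible chain, summed over all hidden chains."""
    alpha = {s: START[s] * emit(s, obs[0]) for s in STATES}
    for o in obs[1:]:
        alpha = {t: sum(alpha[s] * TRANS[s][t] for s in STATES) * emit(t, o)
                 for t in STATES}
    return sum(alpha.values())


def viterbi(obs):
    """Question 1: the single most probable hidden state chain and its probability."""
    delta = {s: START[s] * emit(s, obs[0]) for s in STATES}
    back = []  # back[k][t] = best predecessor of state t at step k + 1
    for o in obs[1:]:
        prev, delta, pointers = delta, {}, {}
        for t in STATES:
            best = max(STATES, key=lambda s: prev[s] * TRANS[s][t])
            pointers[t] = best
            delta[t] = prev[best] * TRANS[best][t] * emit(t, o)
        back.append(pointers)
    state = max(STATES, key=lambda s: delta[s])
    best_prob = delta[state]
    path = [state]
    for pointers in reversed(back):   # walk the back-pointers to the start
        state = pointers[state]
        path.append(state)
    return path[::-1], best_prob


obs = [1, 6, 3, 5, 2, 7, 3, 5, 2, 4]  # the visible chain from the text
print("P(visible chain) =", forward(obs))
print("most probable hidden chain:", " ".join(viterbi(obs)[0]))
```

With uniform transitions the most probable chain simply picks, for each roll, the die with the fewest faces that could have produced it, so it differs from the example chain D4 D6 D8 D6 D4 D8 D6 D6 D6 D4 given earlier, which is merely one possible chain. Question 3 is addressed by the Baum-Welch algorithm.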

III. Application of HMM in stock prediction

 

Application of HMM in speech recognition:

(1) First, feature sequences are extracted from the input speech and used to train the models, yielding locally optimal parameter estimates. The HMM speech recognition training process is shown below:

(2) Next, for the speech to be recognized, the corresponding feature sequence is extracted, and the forward-backward algorithm is used to estimate its likelihood under each class of model. The model with the maximum output probability is selected, which achieves recognition. The HMM speech recognition process is shown below:

HMM-based pattern recognition model for stock market prediction:

(1) First, following a prior classification, take the historical data for the several weeks preceding each date whose subsequent trend belongs to the same class, extract feature indicators (transaction price, volume, etc.) to form the corresponding stock data sequences as model inputs, and apply the Baum-Welch algorithm to train a model for each class. The training process is shown below:

(2) Second, using the trained HMMs, take the selected stock feature indicators (transaction price, volume, etc.) over the preceding several weeks as the input sequence, apply the forward-backward algorithm to compute the probability of the sequence under each model, and select the model with the maximum probability to obtain the recognized trend for the next period. The recognition process is shown below:
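The two steps above can be sketched in a few dozen lines of plain Python: train one discrete HMM per class with Baum-Welch, then label a new sequence by whichever model gives it the higher likelihood. Everything specific here is an assumption made for illustration: observations are reduced to two symbols (0 = down day, 1 = up day), the training data is synthetic, each model has two hidden states, and no scaling is used, so sequences must stay short. It is not the article's actual model or data.

```python
import math
import random


def forward(obs, pi, A, B):
    """alpha[t][i] = P(o_1..o_t, state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(obs, A, B):
    """beta[t][i] = P(o_{t+1}..o_T | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def log_likelihood(obs, model):
    """Log probability of a sequence under model = (pi, A, B)."""
    pi, A, B = model
    return math.log(sum(forward(obs, pi, A, B)[-1]))


def baum_welch(seqs, n_states, n_symbols, iters=20, seed=0):
    """Fit (pi, A, B) to a list of symbol sequences with plain EM."""
    rng = random.Random(seed)

    def rand_row(k):
        row = [rng.random() + 0.5 for _ in range(k)]
        return [x / sum(row) for x in row]

    pi = rand_row(n_states)
    A = [rand_row(n_states) for _ in range(n_states)]
    B = [rand_row(n_symbols) for _ in range(n_states)]
    for _ in range(iters):
        pi_num = [0.0] * n_states
        A_num = [[0.0] * n_states for _ in range(n_states)]
        B_num = [[0.0] * n_symbols for _ in range(n_states)]
        for obs in seqs:
            alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
            p = sum(alpha[-1])
            for t, o in enumerate(obs):
                for i in range(n_states):
                    gamma = alpha[t][i] * beta[t][i] / p   # P(state_t = i | obs)
                    B_num[i][o] += gamma
                    if t == 0:
                        pi_num[i] += gamma
                    if t + 1 < len(obs):
                        for j in range(n_states):          # expected i -> j moves
                            A_num[i][j] += (alpha[t][i] * A[i][j]
                                            * B[j][obs[t + 1]] * beta[t + 1][j] / p)
        pi = [x / sum(pi_num) for x in pi_num]
        A = [[x / sum(row) for x in row] for row in A_num]
        B = [[x / sum(row) for x in row] for row in B_num]
    return pi, A, B


def make_seqs(p_up, count, length, rng):
    """Synthetic training data: each day is an up day with probability p_up."""
    return [[1 if rng.random() < p_up else 0 for _ in range(length)]
            for _ in range(count)]


# Step (1): train one model per class on synthetic "up" and "down" samples.
data_rng = random.Random(1)
up_model = baum_welch(make_seqs(0.7, 30, 10, data_rng), n_states=2, n_symbols=2)
down_model = baum_welch(make_seqs(0.3, 30, 10, data_rng), n_states=2, n_symbols=2)


def classify(obs):
    """Step (2): pick the class whose model gives the higher likelihood."""
    up, down = log_likelihood(obs, up_model), log_likelihood(obs, down_model)
    return "up" if up >= down else "down"


print(classify([1, 1, 0, 1, 1, 1, 0, 1, 1, 1]))
print(classify([0, 0, 1, 0, 0, 0, 0, 1, 0, 0]))
```

Only the forward pass is needed to score a sequence; the backward pass is used during training. Real price and volume features are continuous, so a practical version would either discretize them or use continuous output distributions, and would work in log space or with scaling to avoid underflow on longer sequences.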

IV. Empirical results of the HMM strategy

 

4.1 Strategy description

Timing target: CSI 300 Index;

Time interval: 2007/07/23 to 2016/09/09;

From daily data such as price changes, trading value, turnover, and the amount of active orders, we construct the following observation sequence variables:

X1: daily return;

X2: daily net capital inflow as a proportion of the day's total capital flow;

X3: period-over-period change in total daily capital flow;

X4: standardized capital flow, namely (total daily capital flow - average capital flow over the past year) / volatility of capital flow over the past year;

X5: period-over-period change in turnover;

X6: period-over-period change in turnover;

X7: standardized daily turnover.
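As a rough sketch of how variables such as X3 and X4 could be computed from a daily series, the helpers below implement a period-over-period change and a trailing standardization. Several details are assumptions not stated in the text: a year is taken as 250 trading days, the trailing window excludes the current day, "volatility" is read as the population standard deviation, and the function names and sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def period_over_period(series):
    """Change relative to the previous observation; the first value has none."""
    return [None] + [cur / prev - 1.0 for prev, cur in zip(series, series[1:])]


def rolling_zscore(series, window=250):
    """(x_t - mean of the previous `window` values) / their standard deviation.

    Returns None where fewer than two past values exist or the past values
    do not vary, since the ratio is undefined there.
    """
    out = []
    for t, x in enumerate(series):
        past = series[max(0, t - window):t]
        if len(past) < 2 or pstdev(past) == 0:
            out.append(None)
        else:
            out.append((x - mean(past)) / pstdev(past))
    return out


# Hypothetical total daily capital flow, for illustration only.
flow = [100.0, 110.0, 99.0, 120.0, 105.0]
print(period_over_period(flow))        # X3-style variable
print(rolling_zscore(flow, window=3))  # X4-style variable with a short window
```

Excluding the current day from the window keeps the standardized value free of look-ahead, which matters when the variable feeds a timing model.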

 

We select different combinations of the observed variables and, based on the weekly change in the underlying index, divide all weekly samples into two classes (up and down). HMMs are then trained on each class using the different combinations of observed variables.

4.2 Principle of HMM-based index timing

Training on the two classes of sample data (class one up, class two down) yields the corresponding models HMM1 and HMM2. New observations are then fed to both models as input, and the direction of the market in the following week is judged by which model assigns the higher probability; the strategy goes long or short the index accordingly. In addition, to keep consecutive prediction errors from causing larger losses, we add a stop-loss mechanism: when the cumulative loss since the last signal opened a position reaches a threshold (e.g. 5%), the current position is closed, and a new position is opened only when the next opposite signal appears.
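The timing rule and stop-loss mechanism described here can be sketched as a small weekly loop. This is a simplified reading, not the article's implementation: signals are assumed to be +1 when HMM1 (up) is more probable and -1 when HMM2 (down) is, a short position is assumed to earn the negative of the index return, trading costs are ignored, and `run_strategy` with its parameters is a hypothetical name.

```python
def run_strategy(signals, returns, stop_loss=0.05, allow_short=True):
    """Apply weekly timing signals with a stop-loss.

    signals[t] is +1 (up model more probable) or -1 (down model more
    probable), decided before week t; returns[t] is the index return in
    week t. Returns the final equity (starting from 1.0) and the position
    held in each week (+1 long, -1 short, 0 flat).
    """
    position = 0        # current position
    stopped_side = 0    # side that was stopped out; 0 if not stopped
    since_open = 1.0    # growth of the current position since it was opened
    equity = 1.0
    positions = []
    for sig, ret in zip(signals, returns):
        target = sig if (sig > 0 or allow_short) else 0
        if stopped_side != 0:
            if sig == stopped_side:
                target = 0          # stay flat until an opposite signal appears
            else:
                stopped_side = 0    # opposite signal: trading may resume
        if target != position:
            position, since_open = target, 1.0   # open a new position
        weekly = 1.0 + position * ret
        equity *= weekly
        since_open *= weekly
        positions.append(position)
        # Stop-loss: cumulative loss since opening has reached the threshold.
        if position != 0 and since_open <= 1.0 - stop_loss:
            stopped_side, position = position, 0
    return equity, positions


signals = [1, 1, 1, -1, 1]
returns = [-0.03, -0.03, 0.10, -0.02, 0.04]
print(run_strategy(signals, returns, allow_short=True))
print(run_strategy(signals, returns, allow_short=False))
```

In this toy run the long position loses about 5.9% over the first two weeks, so the stop-loss closes it and the strategy sits out week three even though the signal is still +1; it re-enters only after the -1 signal in week four.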

4.3 Strategy performance

(1) Without short selling

If the index position is simply closed when the signal is bearish, with no short selling, then over the 450 weeks from July 20, 2007 to September 9, 2016 the strategy issued 62 buy signals, 61 sell signals, and 8 stop-loss signals. On average, a trading signal was issued once every 3.8 weeks. The prediction was correct in 250 of the weeks, an accuracy of 56%. The strategy's cumulative return was 183%, or 21.1% annualized.

(2) With short selling

If a short position on the index is opened when the signal is bearish, then over the 450 weeks from July 20, 2007 to September 9, 2016 the strategy issued 62 buy signals, 61 sell signals, and 16 stop-loss signals, and was out of the market for 31 weeks because of stop-losses. On average, a trading signal was issued once every 3.8 weeks. The prediction was correct in 250 of the weeks, an accuracy of 56%. The strategy's cumulative return was 899%, or 103.9% annualized.
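A quick arithmetic check helps in reading the annualized figures. Dividing each cumulative return by the length of the test period in years (450 weeks at an assumed 52 weeks per year) gives 21.1% and 103.9%, so the reported numbers appear to be simple, non-compounded averages; this is an inference from the figures, not something the text states. The sketch below computes both the simple and the compound version for comparison.

```python
WEEKS_PER_YEAR = 52  # assumed convention


def simple_annualized(cumulative_return, weeks):
    """Cumulative return divided by the number of years (no compounding)."""
    return cumulative_return * WEEKS_PER_YEAR / weeks


def compound_annualized(cumulative_return, weeks):
    """Constant yearly growth rate that compounds to the cumulative return."""
    return (1.0 + cumulative_return) ** (WEEKS_PER_YEAR / weeks) - 1.0


# Cumulative returns reported in the text: 183% and 899% over 450 weeks.
for label, total in [("without short selling", 1.83), ("with short selling", 8.99)]:
    print(label,
          f"simple={simple_annualized(total, 450):.1%}",
          f"compound={compound_annualized(total, 450):.1%}")
```

The compound rates come out considerably lower than the simple ones, which is worth keeping in mind when comparing these results with figures quoted on a compound basis.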

V. Summary

 

5.1 Significance and innovation

This report is the first to propose introducing the HMM pattern recognition model into the problem of forecasting stock price movements. By solving the HMM learning problem and recognition problem, it builds an index timing model based on variables such as daily returns and daily capital flows. In the empirical test, both the prediction accuracy and the returns of the timing strategy were relatively good, which gives the model considerable theoretical and practical significance.

Because HMM algorithms are quite mature, efficient, effective, and easy to train on existing data, choosing an HMM for pattern recognition of stock price movements is not only a significant innovation but also a choice worth exploring.

5.2 Shortcomings of the model

(1) The model's forecast accuracy needs further improvement;

(2) The choice of input vector is key to an HMM. A limitation of this article is that the input variables are constructed only from the share price, turnover rate, and capital flow, among all the stock market information that could be extracted.



Origin blog.csdn.net/zk168_net/article/details/104690445