Why quantitative analysis starts with linear regression

Linear regression needs relatively little computation and analysis, yet it gives predictions or explanations of "passable quality" derived from real data, although those predictions or explanations are not always sound enough to rely on.

Linear regression analysis is the basic analytical framework that everyone who studies statistics or quantitative analysis meets first. The framework traces back to the method of least squares, published by Legendre in 1805 and by Gauss in 1809, which scientists of the time used to calculate the trajectories of celestial bodies. Some 200 years later, linear regression is still in active use in all walks of life, including quantitative investment, and its core ideas and methods have contributed a great deal to other quantitative analysis methods. So what unique advantages does linear regression analysis have in the quantitative field?

First, linear regression analysis assumes a linear relationship between the independent variable and the dependent variable, and this assumption is very useful. Many real relationships do not satisfy it, but that is not a problem: transforming some of the variables can often turn a non-linear relationship into a linear one.

For example, in the model Y = a + bX^2, Y is linear in the square of X rather than in X itself. We can define a new variable X' = X^2, so that the new equation Y = a + bX' is linear and the linear regression framework applies. In addition, although many real relationships between variables are non-linear, over a short enough period of time the relationship can be approximated by a short straight line. As long as we do not predict too far ahead, the prediction accuracy remains reasonably reliable.
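The substitution X' = X^2 can be sketched in a few lines of Python. This is only an illustration: the helper fit_line and the data (generated from Y = 1 + 2X^2 with no noise) are made up for the example and do not come from any real data set.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data that follows Y = 1 + 2*X^2 exactly
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x ** 2 for x in xs]

# Replace X with the new variable X' = X^2, then fit a straight line in X'
x_prime = [x ** 2 for x in xs]
a, b = fit_line(x_prime, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Because Y is exactly linear in X', the fit recovers the intercept and slope used to generate the data. With real, noisy data the recovered values would only be estimates.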

Second, linear regression analysis assumes that changes in a variable are determined mainly by the linear relationship but are also affected by other random disturbances, and that these disturbances cause the prediction errors. It further assumes that the overall effect of these random disturbances is small.

So how do we separate the main relationship from the random disturbances? The idea of least squares is to find the straight line that minimizes the sum of the squared vertical distances between the data points and the line (the errors attributed to random disturbance). This line is the best estimate of the linear relationship, the one least distorted by random disturbance.
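A minimal sketch of this idea, assuming a single independent variable and using the standard closed-form solution for the slope and intercept. The data points and the helper names fit_line and sse are invented for illustration. The sketch fits the line and then checks that nudging the intercept or slope away from the fitted values makes the sum of squared errors larger.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def sse(a, b, xs, ys):
    """Sum of squared vertical distances between the points and y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical noisy observations scattered around a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best = sse(a, b, xs, ys)

# Any other line should have a larger sum of squared errors
shifted = sse(a + 0.1, b, xs, ys)
tilted = sse(a, b + 0.1, xs, ys)
print(best < shifted, best < tilted)
```

The comparison only probes two nearby lines, so it illustrates the minimum rather than proving it.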

Finally, and most importantly, linear regression gives predictions or explanations of "passable quality" derived from real data at relatively low computational and analytical cost. Given data, linear regression will always return a definite answer (except in degenerate cases, such as an isolated data point). Most off-the-shelf software supports it, from the ubiquitous Excel to more specialized tools such as R or MATLAB, usually with a single click.

There are, however, a few caveats to keep in mind.

01 First, correlation does not equal causation, because causation requires exclusivity in addition to correlation. The simplest example is the rooster crowing before sunrise: the two are correlated, but the relationship is not causal, because the sun rises even when no rooster crows.

02 Second, the prediction accuracy of linear regression is generally not high compared with other methods such as exponential smoothing, because correlations in real data may change over time. Even the simplest money-making algorithm may stop making money after a while, which indicates that the correlation between the algorithm and profit has weakened. So for longer-term data and forecasts, linear regression can serve only as a rough directional reference.
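One rough way to see whether a correlation has drifted is to compute it separately over an earlier and a later window. The sketch below uses the Pearson correlation coefficient as the measure. The series are invented for illustration, and the half-and-half window split is an arbitrary assumption, not a recommended setting.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical series: y rises with x at first, then the relationship reverses
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 11.0, 10.2, 8.9, 8.1, 7.0, 5.8]

half = len(x) // 2
early = pearson(x[:half], y[:half])   # correlation in the first window
late = pearson(x[half:], y[half:])    # correlation in the second window
print(f"early window: {early:.2f}, late window: {late:.2f}")
```

A line fitted on the early window would extrapolate badly into the late one, which is why long-horizon forecasts from a single fitted line deserve caution.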

Nevertheless, as a basic tool, linear regression is still used by many quantitative institutions at home and abroad. Faced with a large amount of data, analysts typically first use linear regression to screen out the variables that may be related, then check whether there is a logical connection behind them and whether another algorithm can describe the relationship between the variables more accurately. Here linear regression acts as a sieve and saves quantitative analysts a lot of preparatory work.
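The sieve idea can be sketched as follows. For a single candidate variable, the strength of the fitted linear relationship is captured by the Pearson correlation coefficient, so the sketch ranks candidates by its absolute value. The variable names, the data, and the 0.5 cutoff are all hypothetical choices made for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical target and candidate variables
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
candidates = {
    "feat_a": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],    # moves with the target
    "feat_b": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],    # mostly unrelated
    "feat_c": [12.0, 9.8, 8.1, 6.2, 3.9, 2.0],   # moves against the target
}

THRESHOLD = 0.5  # arbitrary cutoff, assumed for illustration only

# Keep only the variables whose linear association with the target is strong
selected = [
    name for name, values in candidates.items()
    if abs(pearson(values, target)) >= THRESHOLD
]
print(selected)
```

Variables that pass the sieve are only candidates. The logical check described above and any comparison with other algorithms still have to follow.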

Origin blog.csdn.net/zk168_net/article/details/104753171