Your endogenous solution is out, ERM has ruled the world and is leading the way

Contributions and recommendations on econometrics-related topics are welcome at the Econometrics Circle.

Email: [email protected]

Copyright: Econometrics Circle (ID: econometrics666). The related do-files and key materials are posted in our knowledge community, where they can be downloaded and used directly. The Econometrics Circle also has dedicated sub-communities for discussing endogeneity.


There are many ways to deal with endogeneity, such as instrumental variables, Heckman selection correction, and matching-based treatment-effect estimators. Here we introduce the extended regression model (ERM). Its biggest advantage is that it can handle endogenous explanatory or control variables, non-random assignment of a policy (treatment) variable, and endogenous sample selection. All three are, in essence, endogeneity problems, and ERM handles them simultaneously within one framework.

The commands you usually use for these problems, such as ivregress, xtivreg, heckman, ivprobit, heckoprobit, and etregress, each cover only one special case of the ERM framework, and the ERM commands can fit the same models. Since every researcher's energy is limited, we think it is unnecessary to learn each of those commands separately and recommend mastering the ERM framework instead.

First, get a feel for the commands in the ERM framework: eregress is for a continuous dependent variable, eintreg is for an interval-measured dependent variable, and eprobit and eoprobit are for a binary and an ordinal dependent variable, respectively. There is not yet a command for a dependent variable that is multi-valued but unordered.


The beauty of the ERM framework is that, whether your endogenous variable is continuous, binary, or ordinal, a single option lets you instrument it. By contrast, the ivprobit command can only handle continuous endogenous variables. It cannot handle discrete ones, yet many people use it that way and publish the results.
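A minimal sketch of that option follows. All variable names here (y, x1, z1, w, d, r) are hypothetical and do not come from any dataset in this article. The only thing that changes between the three calls is the suboption inside endogenous().

```stata
* Hypothetical variables: y (binary outcome), x1 (exogenous covariate),
* z1 (instrument), and endogenous covariates w (continuous), d (binary), r (ordinal)

* Continuous endogenous covariate: a linear first stage is the default
eprobit y x1, endogenous(w = x1 z1)

* Binary endogenous covariate: probit first stage
eprobit y x1, endogenous(d = x1 z1, probit)

* Ordinal endogenous covariate: ordered probit first stage
eprobit y x1, endogenous(r = x1 z1, oprobit)
```

In each case the endogenous covariate is added to the main equation automatically, so it is not listed after y.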

The ERM framework also allows endogenous variables to interact with other control variables, and allows squared and cubic terms of endogenous variables to enter the model. This is not readily available in commands such as ivregress, and ERM can at the same time deal with endogenous selection bias and non-random treatment assignment.

Take a closer look at the existing endogeneity commands that each ERM command corresponds to. Once you understand ERM, you no longer need to remember all of those separate commands. What's more, ERM can handle the three endogeneity problems mentioned above simultaneously in a single model.

eregress fits linear regression and can replace regress, ivregress, teffects ra, and heckman.
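As a rough sketch of this correspondence, the lines below fit the same instrumental-variables model with ivregress and with eregress. The hsng2 dataset and its variables are borrowed from the ivregress documentation example, not from this article. Note that endogenous() takes the full first-stage equation, so the exogenous covariate pcturban is listed there explicitly. Because eregress uses maximum likelihood rather than two-stage least squares, the estimates should be close but need not match exactly.

```stata
* Assumption: hsng2 is the example dataset used in the ivregress documentation
webuse hsng2, clear

* Classic two-stage least squares: hsngval is instrumented by faminc and region
ivregress 2sls rent pcturban (hsngval = faminc i.region)

* The same model as an ERM; the first-stage covariates are written out in full
eregress rent pcturban, endogenous(hsngval = pcturban faminc i.region)
```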


eintreg fits interval regression and can replace intreg, tobit, and ivtobit.
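The tobit connection is easiest to see from the fact that eintreg takes two dependent variables, the lower and upper bounds of the interval. The sketch below assumes hypothetical variables: y is an outcome left-censored at 0, x1 is an exogenous covariate, and w is a continuous endogenous covariate with instrument z1.

```stata
* Hypothetical variables: y (left-censored at 0), x1, w, z1
* Uncensored observations: both bounds equal y
* Censored observations (y == 0): lower bound missing, upper bound 0
generate double y_lo = cond(y > 0, y, .)
generate double y_hi = y

* Without endogeneity this is the same model as: tobit y x1, ll(0)
eintreg y_lo y_hi x1

* With an endogenous covariate this corresponds to the ivtobit setup
eintreg y_lo y_hi x1, endogenous(w = x1 z1)
```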


eprobit fits probit regression for a binary outcome and can replace probit, ivprobit, teffects ra, and heckprobit.


eoprobit fits ordered probit regression and can replace oprobit and heckoprobit.


Below we give a few simple examples to show readers of the Econometrics Circle the advantages of the ERM framework.

Example one

We want to study whether participating in a training program organized by the school helps students graduate. Two problems arise. The first is that assignment to the training program is not random: students who are more likely to graduate anyway are also more willing to participate. The second is that high school GPA, a control variable in the college graduation equation, is itself endogenous.

This is where the eprobit command in the ERM framework shows its strength. graduate is a binary dummy variable, equal to 1 if the student graduates and 0 otherwise. It depends on high school GPA (hsgpa), family income (income), and roommate status (roommate). Participation in the non-randomly assigned training program (program) is predicted by whether the student lived on campus in the first year (campus) and by family income, which serve as instruments. High school GPA, the endogenous control variable, is predicted by family income and the competitiveness of the high school (hscomp), which also serve as instruments. Note: hsgpa is a continuous variable, so no option such as probit or oprobit is needed for it, because a continuous endogenous variable is the default. The regression results are not shown here. Results for the other examples are shown and explained below.

eprobit graduate income i.roommate, entreat(program = i.campus income) ///
    endogenous(hsgpa = income i.hscomp)

Example two

Next, let's look at an ordered-response regression in which the endogenous variable is a dummy variable. Note that we specify the probit option inside the parentheses, because the key explanatory variable x is a binary dummy variable. We specify nomain because x is already listed after y in the main equation. Without nomain, endogenous() would add x to the main equation a second time. The highlighted part of the output deserves attention: corr(ex, ey) is the correlation between the error terms of the two equations. Its p-value is significant, so we conclude that this treatment is reasonable and that x is indeed an endogenous dummy variable.

eoprobit y x x1 x2, endogenous(x = i1 i2 i3, probit nomain)


Example three

Now let's look at a case that combines endogeneity from sample selection with endogeneity from non-random assignment of a policy variable. We want to know whether participating in a medical insurance program benefits students' health. The endogeneity problems are obvious at a glance. People in worse health may be more willing to take out insurance. The sample we end up with is also likely to be endogenously selected: students who did not participate or who are in poor health are less likely to return the questionnaire, so the sample needs to be corrected for selection.

webuse womenhlth // Load the example dataset that ships with Stata's documentation

eoprobit health i.exercise c.grade, entreat(insured = grade i.workschool) ///
    select(select = i.insured i.regcheck) vce(robust) // See whether participating in the insurance program improves health

Let's look at the final results. Note: entreat() interacts the endogenous treatment variable with the covariates of the main equation. There are three regression equations, one for each of the three parts of the command, but we only look at the final result. As the output shows, participating in the school's medical insurance program does improve students' health. Whether or not the student exercises, and whatever the student's grade, insured has a positive effect on health. The three corr() estimates at the bottom are also significant, which indicates that our approach is reasonable.


Example four

Next, another example. We want to know whether going to college raises wages. Whether someone goes to college is obviously endogenous, so it has to be estimated with instrumental variables. Note that we add the probit option at the end, because college attendance is an endogenous dummy variable. This is another advantage of the ERM framework: you can freely choose the type of your endogenous variable (continuous, binary, or ordinal). Here we also use extreat(), which means that the 0-1 variable college is treated as an exogenous treatment variable. This must be distinguished from the endogenous treatment variables used earlier, because the two specifications affect the estimation results differently. Keep in mind that you have to judge for yourself whether a treatment variable is endogenous or exogenous.

webuse wageed // Load the example dataset that ships with Stata's documentation

eregress wage c.age##c.age tenure, endogenous(college = i.peduc, probit) ///
    extreat(college) // See whether college raises wages

Next, let's look at the results. They show that going to college has a positive effect on wages. You can also look at the interaction terms between college and the control variables further down to explore specific mechanisms. For example, the coefficients on the interactions of college with age and with age squared indicate that the effect of college on wages varies with age in an inverted-U shape.


Because we used the treatment effect variable college, we can now estimate the average treatment effect.

estat teffects // get the average treatment effect


estat teffects, atet // get the average treatment effect on the treated


margins, over(college) // get the average predictions by college status


marginsplot, plot(college) // plot the margins by college status

There are only two points in this plot, because we specified only college in the margins command and did not include continuous variables such as age and tenure. Since this is just an illustration, there is no need to read too much into the plot. It clearly shows that wages are higher for those who go to college.


margins r.college, over(age tenure) predict(fix(college)) // Now compute the effect of college for groups with different ages and work experience

marginsplot, by(age) // plot the effects in separate panels by age

Since this takes a long time to run, we explain the result with a graph that was produced earlier. For example, for students with a very low GPA, going to college raises their wages, as the upper-left panel shows. For students from very low-income families, going to college has a more positive effect on wages if their GPA is at a medium-to-high level, as the lower-left panel shows. This indicates that the relationship between college attendance and wages differs across groups.


There is much more to learn about the ERM framework, which can be discussed in detail in the Econometrics Circle knowledge community.


Origin blog.51cto.com/15057855/2679909