* Structural Breaks in the US Gasoline Market
*
* Keywords:
* regression, ols, log, gasoline, market, us, chow, test, structural break
*
* Description:
* We illustrate how to conduct a Chow Structural Break test on a log-log
* OLS model for the U.S. per capita gasoline consumption
*
* Author(s):
* Noel Roy
* Skif Pankov
*
* Source:
* William H. Greene, Econometric Analysis - 7th Edition
* Pearson International Edition, Chapter 6, Example 6.9 (page 212)
*

* Setting the first time period to be equal to year 1953 with periodicity of
* one year
TIME 1953.0 1
SAMPLE 1953.0 2004.0

* Reading the datafile and naming the variables, specifying to ignore the
* first line of the file
read (TableF2-2.shd) year gasexp pop gasp income pnc puc ppt pd pn ps / skiplines = 1

* Generating logs of variables
genr lngpop = log(gasexp/pop/gasp)
genr lnincome = log(income)
genr lnpg = log(gasp)
genr lnpnc = log(pnc)
genr lnpuc = log(puc)

* Replicating figure 6.5
genr g = gasexp / gasp
graph gasp g / nokey

* Calculating time trend
genr t = year - 1952

* Running an OLS regression of lngpop on lnpg, lnincome, lnpnc, lnpuc and t,
* specifying that it's a log-log model
ols lngpop lnpg lnincome lnpnc lnpuc t / loglog

* Testing for a structural break in the model after the OPEC price shock in 1973
* with a Chow test by using the diagnos command (observation 21 is 1973)
*
diagnos / chowone = 21

* Calculating the Chow test statistic the "long way". The coef = and cov =
* options on the sub-sample regressions save the estimated coefficients and
* their covariance matrices for the Wald statistic computed at the end of
* the script.
*
ols lngpop lnpg lnincome lnpnc lnpuc t
gen1 sse=$sse
gen1 k=$k

* First sub-sample: 1953-1973
sample 1 21
ols lngpop lnpg lnincome lnpnc lnpuc t / coef = theta1 cov = v1
gen1 sse1=$sse
gen1 n1=$n

* Second sub-sample: 1974-2004
sample 22 52
ols lngpop lnpg lnincome lnpnc lnpuc t / coef = theta2 cov = v2
gen1 sse2=$sse
gen1 n2=$n

* Computing and outputting the test statistic.
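* For reference, the commands below evaluate the standard Chow F statistic,
* written here in the variable names used by this script:
*
*   F = [(sse - sse1 - sse2)/k] / [(sse1 + sse2)/(n1 + n2 - 2*k)]
*
* where sse is the full-sample sum of squared errors, sse1 and sse2 are the
* sums for the 1953-1973 and 1974-2004 sub-samples, and k is the number of
* estimated coefficients (including the constant). Under the null hypothesis
* of no structural break and the classical assumptions, F follows an
* F-distribution with k and n1 + n2 - 2*k degrees of freedom. With k = 6,
* n1 = 21 and n2 = 31 this gives df1 = 6 and df2 = 21 + 31 - 12 = 40.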
gen1 df1 = k
gen1 df2 = n1+n2-2*k
gen1 f = ((sse-sse1-sse2)/df1)/((sse1+sse2)/df2)
print f df1 df2

* Computing the probability density function (pdf) and the cumulative
* distribution function (cdf) for variable f, specifying that it has an
* F-distribution with df1 and df2 degrees of freedom. The upper-tail
* probability (1-CDF) is the p-value of the test.
distrib f / type = f df1 = df1 df2 = df2

* Testing whether the observations for 1974, 1975, 1980, and 1981
* are consistent with the unrestricted estimate

* Defining the sample as 1953-1973, 1976-1979, 1982-2004.
* The four years 1974, 1975, 1980, and 1981 (observations 22, 23, 28
* and 29) are excluded
sample 1 21 24 27 30 52

* Running an OLS regression for the selected sample
?ols lngpop lnpg lnincome lnpnc lnpuc t

* Computing the test statistic (6-15).
gen1 df1 = 4
gen1 df2 = $n-k
gen1 fstat = ((sse-$sse)/df1)/($sse/df2)
print fstat df1 df2
distrib fstat / type=f df1=df1 df2=df2

* An alternative method of calculating this statistic takes the full sample
* with dummy variables for the years in which there is a structural break.
* These can be created using the dum function, or, alternatively, by the
* if command

* Restoring the full sample
sample 1 52

* Defining the dummy variables
genr y1974 = 0
genr y1975 = 0
genr y1980 = 0
genr y1981 = 0

* The if command sets a variable to a given value if a logical condition
* is satisfied.
if (year .eq. 1974) y1974 = 1
if (year .eq. 1975) y1975 = 1
if (year .eq. 1980) y1980 = 1
if (year .eq. 1981) y1981 = 1

* Estimating the equation with dummy variables included
?ols lngpop lnpg lnincome lnpnc lnpuc t y1974 y1975 y1980 y1981

* Computing the test statistic
gen1 df1 = 4
gen1 df2 = $n-$k
gen1 f = ((sse-$sse)/df1)/(($sse)/df2)
print f df1 df2
distrib f / type=f df1=df1 df2=df2

* Estimating the pooled model with different constant terms con1 and
* con2. The constants can be generated using the if command
genr con1 = 1
genr con2 = 1
if (year .le. 1973) con2 = 0
if (year .gt. 1973) con1 = 0

* Running an OLS regression, specifying not to use a constant (since
* con1 and con2 take its place)
ols lngpop con1 con2 lnpg lnincome lnpnc lnpuc t / noconstant

* Computing the test statistic
gen1 df1 = k-1
gen1 df2 = n1+n2-2*k
gen1 f = (($sse-sse1-sse2)/df1)/((sse1+sse2)/df2)
print f df1 df2
distrib f / type=f df1=df1 df2=df2

* Suppose that in the restricted model, the coefficients of lnincome, lnpg,
* and the constant may differ in the two periods.
genr ly1 = con1*lnincome
genr ly2 = con2*lnincome
genr lpg1 = con1*lnpg
genr lpg2 = con2*lnpg
?ols lngpop con1 con2 ly1 ly2 lpg1 lpg2 lnpnc lnpuc t / noconstant

* Computing the test statistic.
gen1 df1 = k-3
gen1 df2 = n1+n2-2*k
gen1 f = (($sse-sse1-sse2)/df1)/((sse1+sse2)/df2)
print f df1 df2
distrib f / type=f df1=df1 df2=df2

* Testing the null hypothesis that the difference between the parameters
* of the two time periods is zero by computing the Wald statistic with
* the matrix command
matrix w = (theta1-theta2)'inv(v1+v2)*(theta1-theta2)

* Outputting w
print w

* The Wald statistic is asymptotically chi-squared with k degrees of freedom,
* so we can use the distrib command to calculate its p-value
distrib w / type=chi df=k

stop