SHAZAM The Chow Test

Testing for Structural Stability - the Chow Test

It may be of interest to test for stability of regression coefficients between two periods. A change in parameters between two periods is an indication of structural change. Following an OLS estimation, the Chow test statistic for structural change is reported with the commands:

OLS . . .
DIAGNOS / CHOWONE=n1

where n1 is the number of observations in the first group.

Alternatively, to compute the test statistic at every possible breakpoint in the data set, use the commands:

OLS . . .
DIAGNOS / CHOWTEST

The computations required to obtain the Chow test statistic can be illustrated with SHAZAM commands. These commands give an example of programming in SHAZAM.
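For reference, the statistic can be written out as follows. This is only a sketch that transcribes the arithmetic of the SHAZAM program listed later in this guide; the symbols follow the variable names used there: $SSE_A$ is the sum of squared errors from the regression on the complete sample, $SSE_1$ and $SSE_2$ come from the two group regressions, $K$ is the number of coefficients, and $N_1$, $N_2$ are the group sizes.

```latex
\[
F \;=\; \frac{(SSE_A - SSE_B)/K}{SSE_B/(N_1 + N_2 - 2K)},
\qquad SSE_B = SSE_1 + SSE_2 .
\]
```

Under the null hypothesis of equal coefficients in the two groups, the statistic is compared with an F distribution with $K$ and $N_1 + N_2 - 2K$ degrees of freedom. For a two-coefficient model with 22 observations this gives 2 and 18 degrees of freedom.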

Example

This example is from Exercise 8.35 of Gujarati [1995, pp. 279-280]. A data set on personal savings and income for the United States is available for the years 1970 to 1991. It is of interest to investigate whether the savings-income relationship changed significantly between the periods 1970-1980 and 1981-1991 (the Reagan-Bush presidency era).

Gujarati suggests that either a linear or log-linear model may be used to estimate a savings-income relationship. The SHAZAM commands (filename: USECON.SHA) below estimate both a linear and log-linear model. After each OLS estimation a Chow test for structural change is computed.

SAMPLE 1 22
READ (USECON.txt) YEAR SAVINGS INCOME
* Estimate the savings-income relationship
OLS SAVINGS INCOME / RSTAT DWPVALUE
DIAGNOS / CHOWONE=11
* Now consider a log-linear relationship
GENR LSAV=LOG(SAVINGS)
GENR LINC=LOG(INCOME)
OLS LSAV LINC / LOGLOG RSTAT DWPVALUE
DIAGNOS / CHOWONE=11
STOP

The complete SHAZAM output is listed at the end of this guide. For the linear savings-income function the Chow test statistic is reported as:

SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
  N1   N2      SSE1      SSE2     CHOW  PVALUE     G-Q  DF1  DF2  PVALUE
  11   11    1010.8    5103.5   20.371   0.000  0.1981    9    9   0.012
CHOW TEST - F DISTRIBUTION WITH DF1=   2 AND DF2=  18

For the log-linear savings-income function the Chow test statistic is reported as:

SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
  N1   N2      SSE1      SSE2     CHOW  PVALUE     G-Q  DF1  DF2  PVALUE
  11   11   0.11833   0.15911   13.923   0.000  0.7437    9    9   0.333
CHOW TEST - F DISTRIBUTION WITH DF1=   2 AND DF2=  18

The above SHAZAM output uses the following notation:

N1 no. of observations in group 1

N2 no. of observations in group 2

SSE1 sum of squared errors from a regression for group 1

SSE2 sum of squared errors from a regression for group 2

CHOW the Chow test statistic

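G-Q the Goldfeld-Quandt test statistic for testing for
equality of error variance in the 2 groups.
G-Q = (SSE1/DF1)/(SSE2/DF2)

DF1, DF2 degrees of freedom used in the Goldfeld-Quandt statistic: the number of observations in each group less the number of estimated coefficients (here 11 - 2 = 9 in each group)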
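The G-Q formula can be checked by hand against the two outputs shown above. The following is a minimal Python sketch, not part of SHAZAM; the function name `goldfeld_quandt` is made up for illustration, and the inputs are the rounded SSE1, SSE2, DF1 and DF2 values copied from the reported tables.

```python
def goldfeld_quandt(sse1, df1, sse2, df2):
    """Ratio of the two group error-variance estimates: (SSE1/DF1)/(SSE2/DF2)."""
    return (sse1 / df1) / (sse2 / df2)


# Inputs copied from the DIAGNOS output above (rounded as printed by SHAZAM).
gq_linear = goldfeld_quandt(1010.8, 9, 5103.5, 9)
gq_loglinear = goldfeld_quandt(0.11833, 9, 0.15911, 9)

print(round(gq_linear, 4))     # 0.1981
print(round(gq_loglinear, 4))  # 0.7437
```

Both values agree with the G-Q column of the reported output. A ratio well below 1 means the estimated error variance is larger in the second group.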

By inspecting the output it can be seen that for both the linear and log-linear model the p-value reported for the Chow test statistic is less than 0.0005. This gives evidence to reject the null hypothesis of equality of regression coefficients in the 2 periods.

Any test statistic relies on distributional assumptions. The derivation of the Chow test assumes that the errors have the same variance (homoskedasticity) in the 2 groups and that the errors are independently distributed (that is, no autocorrelation). Are these assumptions reasonable for this example?

The SHAZAM output for the Chow test statistic also reports the Goldfeld-Quandt test statistic for equal variance in the 2 groups. The above output shows that for both the linear and the log-linear model the calculated test statistic is less than 1. The p-value that is reported at the extreme right of the SHAZAM output is the p-value for a test of the null hypothesis of equal variance against the alternative hypothesis of larger variance in the second group compared to the first group. With a p-value of 0.012, there is evidence for heteroskedasticity in the linear model. However, with a p-value of 0.333, the homoskedasticity assumption appears reasonable for the log-linear model.

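For both models the Durbin-Watson test rejects the null hypothesis of no autocorrelation in the errors: the reported statistics are 0.549 for the linear model and 0.670 for the log-linear model, each with a p-value below 0.0001. Therefore, there is evidence for model misspecification.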

It may be reasonable to consider that savings behaviour is related to savings in the past. This can be recognized by including a lagged dependent variable as an explanatory variable in the regression equation. The next list of SHAZAM commands shows the estimation of a log-linear equation with a lagged dependent variable.

SAMPLE 1 22
READ (USECON.txt) YEAR SAVINGS INCOME
GENR LSAV=LOG(SAVINGS)
GENR LINC=LOG(INCOME)
* Estimate a log-linear model with a lagged dependent variable
GENR LSAVL1=LAG(LSAV)
* Adjust the sample period
SAMPLE 2 22
OLS LSAV LSAVL1 LINC / LOGLOG DLAG
DIAGNOS / CHOWONE=11
STOP

Note that the lagged dependent variable is included as the first explanatory variable. The DLAG option is used to obtain Durbin's h test as a test for autocorrelation when the model includes a lagged dependent variable.
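The reason for the SAMPLE 2 22 adjustment can be seen in a small sketch. The Python fragment below is an illustration only: the savings figures are hypothetical (they are not the USECON data), and the zero fill in the first position mimics the SHAZAM note that the lag value in undefined observations is set to zero.

```python
import math

# Hypothetical savings figures, for illustration only (not the USECON data).
savings = [50.0, 55.0, 61.0, 58.0, 70.0]
lsav = [math.log(s) for s in savings]

# One-period lag: the first observation has no predecessor, so it is filled
# with zero here, mirroring what SHAZAM reports for undefined lag values.
lsavl1 = [0.0] + lsav[:-1]

# Dropping the first observation (the analogue of SAMPLE 2 22) keeps the
# artificial zero out of the regression.
y = lsav[1:]
x_lag = lsavl1[1:]

print(len(lsav), len(y))  # 5 4
```

After the adjustment every value of the lagged regressor is a genuine previous-period observation, at the cost of one observation.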

The complete SHAZAM output is listed at the end of this guide. The results from the DIAGNOS command are:

SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
  N1   N2      SSE1      SSE2     CHOW  PVALUE     G-Q  DF1  DF2  PVALUE
  11   10   0.13517   0.15749   1.8444   0.182  0.7510    8    7   0.346
CHOW TEST - F DISTRIBUTION WITH DF1=   3 AND DF2=  15

The Chow test statistic does not reject the null hypothesis of parameter stability and the Goldfeld-Quandt test statistic shows no evidence of heteroskedasticity. Durbin's h test statistic has the value 0.077 and so there is no evidence for autocorrelation in the errors. The conclusion is that the log-linear model with a lagged dependent variable reveals no evidence for a structural change during the Reagan-Bush presidency era.
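As a rough cross-check of the Durbin h value, the textbook formula h = rho * sqrt(n / (1 - n * Var(b))) can be evaluated from figures printed in the full output listing: RHO = 0.00984, 21 observations, and a standard error of 0.1767 on the LSAVL1 coefficient. This is a sketch under the assumption that SHAZAM uses this standard form; the function name `durbin_h` is hypothetical, and because the inputs are rounded the result only approximately matches the reported 0.076882.

```python
import math


def durbin_h(rho, n, se_lag):
    """Textbook Durbin h: rho * sqrt(n / (1 - n * se_lag**2)).

    se_lag is the standard error of the lagged dependent variable coefficient.
    """
    denom = 1.0 - n * se_lag ** 2
    if denom <= 0:
        raise ValueError("h is undefined when n * var(b_lag) >= 1")
    return rho * math.sqrt(n / denom)


# Rounded inputs taken from the OLS output for the lagged model.
h = durbin_h(rho=0.00984, n=21, se_lag=0.1767)
print(round(h, 3))  # 0.077
```

The statistic is asymptotically standard normal, so a value this close to zero is far inside the usual critical values of plus or minus 1.96.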


SHAZAM commands for computing the Chow test statistic

The commands below show an example of programming in SHAZAM. The commands compute a Chow test statistic for the linear model in the example given above, splitting the sample after observation 11. A p-value for the test statistic is also computed. The computations should replicate the Chow test statistic that is reported by the DIAGNOS / CHOWONE=11 command (and, at that breakpoint, by the DIAGNOS / CHOWTEST command).

SAMPLE 1 22
READ (USECON.txt) YEAR SAVINGS INCOME
* Suppress output
SET NOOUTPUT
* OLS estimation for group 1
SAMPLE 1 11
OLS SAVINGS INCOME
* The sum of squared errors is available in the temporary variable $SSE
GEN1 SSE1=$SSE
GEN1 N1=$N
* OLS estimation for group 2
SAMPLE 12 22
OLS SAVINGS INCOME
GEN1 SSE2=$SSE
GEN1 N2=$N
* OLS estimation for the complete sample.
SAMPLE 1 22
OLS SAVINGS INCOME
GEN1 SSEA=$SSE
GEN1 K=$K
* Compute the Chow test statistic
GEN1 SSEB=SSE1+SSE2
GEN1 DFDEN=N1+N2-2*K
GEN1 CHOW=((SSEA-SSEB)/K)/(SSEB/DFDEN)
* Get the p-value
SAMPLE 1 1
DISTRIB CHOW / TYPE=F DF1=K DF2=DFDEN CDF=CDF1
GEN1 PVAL=1-CDF1
PRINT CHOW PVAL
STOP

Note that, following model estimation, SHAZAM makes some results available in temporary variables. The names of these variables start with the $ character. The above commands make use of the following temporary variables available after the OLS command.

$N The number of observations used in the OLS regression

$SSE The sum of squared errors

$K The number of coefficients
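The same arithmetic can be reproduced outside SHAZAM from the sums of squared errors printed in the output. The Python sketch below is an independent cross-check, not SHAZAM code: the function names are invented for illustration, the inputs are the rounded SSE values from the reported output, and the p-value helper relies on a closed form that holds only when the numerator degrees of freedom equal 2 (so it is not applied to the three-coefficient lagged model).

```python
def chow_statistic(sse_all, sse1, sse2, n1, n2, k):
    """Return (F, denominator df), mirroring the GEN1 commands above."""
    sse_b = sse1 + sse2               # SSEB = SSE1 + SSE2
    df_den = n1 + n2 - 2 * k          # DFDEN = N1 + N2 - 2*K
    f = ((sse_all - sse_b) / k) / (sse_b / df_den)
    return f, df_den


def f_pvalue_df1_2(f, df_den):
    """Upper-tail p-value of F(2, df_den); closed form valid only for df1 = 2."""
    return (1.0 + 2.0 * f / df_den) ** (-df_den / 2.0)


# SSE values copied (rounded) from the SHAZAM output for each model.
chow_lin, df_lin = chow_statistic(19954.0, 1010.8, 5103.5, 11, 11, 2)
chow_log, df_log = chow_statistic(0.70663, 0.11833, 0.15911, 11, 11, 2)
chow_lag, df_lag = chow_statistic(0.40062, 0.13517, 0.15749, 11, 10, 3)

print(round(chow_lin, 3), df_lin)  # 20.371 18
print(round(chow_log, 3), df_log)  # 13.923 18
print(round(chow_lag, 3), df_lag)  # 1.844 15

# Both two-coefficient models give p-values below 0.0005.
print(f_pvalue_df1_2(chow_lin, df_lin) < 0.0005)  # True
print(f_pvalue_df1_2(chow_log, df_log) < 0.0005)  # True
```

The three statistics agree with the CHOW column reported by DIAGNOS up to rounding of the inputs, and the degrees of freedom match the "F DISTRIBUTION WITH DF1= ... AND DF2= ..." lines.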


SHAZAM output for the linear and log-linear models

|_SAMPLE 1 22
|_READ (USECON.txt) YEAR SAVINGS INCOME
UNIT 88 IS NOW ASSIGNED TO: USECON.txt
   3 VARIABLES AND       22 OBSERVATIONS STARTING AT OBS       1
|_* Estimate the savings-income relationship
|_OLS SAVINGS INCOME / RSTAT DWPVALUE
OLS ESTIMATION
      22 OBSERVATIONS     DEPENDENT VARIABLE = SAVINGS
...NOTE..SAMPLE RANGE SET TO:    1,   22
DURBIN-WATSON STATISTIC =   0.54879
DURBIN-WATSON P-VALUE =  0.000005
R-SQUARE =   0.6396     R-SQUARE ADJUSTED =   0.6216
VARIANCE OF THE ESTIMATE-SIGMA**2 =   997.69
STANDARD ERROR OF THE ESTIMATE-SIGMA =   31.586
SUM OF SQUARED ERRORS-SSE=   19954.
MEAN OF DEPENDENT VARIABLE =   136.91
LOG OF THE LIKELIHOOD FUNCTION = -106.128
VARIABLE    ESTIMATED    STANDARD    T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
  NAME     COEFFICIENT     ERROR       20 DF   P-VALUE    CORR.   COEFFICIENT    AT MEANS
INCOME     0.31461E-01   0.5281E-02    5.958     0.000    0.800      0.7998       0.5790
CONSTANT    57.636        14.91        3.865     0.001    0.654      0.0000       0.4210
DURBIN-WATSON = 0.5488    VON NEUMANN RATIO = 0.5749    RHO =  0.70933
RESIDUAL SUM = -0.49738E-13  RESIDUAL VARIANCE =   997.69
SUM OF ABSOLUTE ERRORS=   536.24
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.6396
RUNS TEST:    5 RUNS,    9 POS,    0 ZERO,   13 NEG  NORMAL STATISTIC = -3.0039
|_DIAGNOS / CHOWONE=11
DEPENDENT VARIABLE = SAVINGS         22 OBSERVATIONS
REGRESSION COEFFICIENTS
  0.314609350421E-01   57.6356858451
SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
  N1   N2      SSE1      SSE2     CHOW  PVALUE     G-Q  DF1  DF2  PVALUE
  11   11    1010.8    5103.5   20.371   0.000  0.1981    9    9   0.012
CHOW TEST - F DISTRIBUTION WITH DF1=   2 AND DF2=  18
|_* Now consider a log-linear relationship
|_GENR LSAV=LOG(SAVINGS)
|_GENR LINC=LOG(INCOME)
|_OLS LSAV LINC / LOGLOG RSTAT DWPVALUE
OLS ESTIMATION
      22 OBSERVATIONS     DEPENDENT VARIABLE = LSAV
...NOTE..SAMPLE RANGE SET TO:    1,   22
DURBIN-WATSON STATISTIC =   0.67040
DURBIN-WATSON P-VALUE =  0.000040
R-SQUARE =   0.8095     R-SQUARE ADJUSTED =   0.8000
VARIANCE OF THE ESTIMATE-SIGMA**2 =  0.35331E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA =  0.18797
SUM OF SQUARED ERRORS-SSE=  0.70663
MEAN OF DEPENDENT VARIABLE =   4.8416
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -99.9099
VARIABLE    ESTIMATED    STANDARD    T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
  NAME     COEFFICIENT     ERROR       20 DF   P-VALUE    CORR.   COEFFICIENT    AT MEANS
LINC        0.66122      0.7172E-01    9.219     0.000    0.900      0.8997       0.6612
CONSTANT   -0.24110      0.5528       -0.4362    0.667   -0.097      0.0000      -0.2411
DURBIN-WATSON = 0.6704    VON NEUMANN RATIO = 0.7023    RHO =  0.64948
RESIDUAL SUM =  0.19429E-15  RESIDUAL VARIANCE =  0.35331E-01
SUM OF ABSOLUTE ERRORS=   3.3446
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.8095
R-SQUARE BETWEEN ANTILOGS OBSERVED AND PREDICTED = 0.6859
RUNS TEST:    5 RUNS,   11 POS,    0 ZERO,   11 NEG  NORMAL STATISTIC = -3.0585
|_DIAGNOS / CHOWONE=11
DEPENDENT VARIABLE = LSAV            22 OBSERVATIONS
REGRESSION COEFFICIENTS
  0.661217901001      -0.241102497959
SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
  N1   N2      SSE1      SSE2     CHOW  PVALUE     G-Q  DF1  DF2  PVALUE
  11   11   0.11833   0.15911   13.923   0.000  0.7437    9    9   0.333
CHOW TEST - F DISTRIBUTION WITH DF1=   2 AND DF2=  18
|_STOP


SHAZAM output for the log-linear model with a lagged dependent variable

|_SAMPLE 1 22
|_READ (USECON.txt) YEAR SAVINGS INCOME
UNIT 88 IS NOW ASSIGNED TO: USECON.txt
   3 VARIABLES AND       22 OBSERVATIONS STARTING AT OBS       1
|_GENR LSAV=LOG(SAVINGS)
|_GENR LINC=LOG(INCOME)
|_* Estimate a log-linear model with a lagged dependent variable
|_GENR LSAVL1=LAG(LSAV)
..NOTE.LAG VALUE IN UNDEFINED OBSERVATIONS SET TO ZERO
|_* Adjust the sample period
|_SAMPLE 2 22
|_OLS LSAV LSAVL1 LINC / LOGLOG DLAG
OLS ESTIMATION
      21 OBSERVATIONS     DEPENDENT VARIABLE = LSAV
...NOTE..SAMPLE RANGE SET TO:    2,   22
R-SQUARE =   0.8689     R-SQUARE ADJUSTED =   0.8543
VARIANCE OF THE ESTIMATE-SIGMA**2 =  0.22257E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA =  0.14919
SUM OF SQUARED ERRORS-SSE=  0.40062
MEAN OF DEPENDENT VARIABLE =   4.8792
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -90.6883
VARIABLE    ESTIMATED    STANDARD    T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
  NAME     COEFFICIENT     ERROR       18 DF   P-VALUE    CORR.   COEFFICIENT    AT MEANS
LSAVL1      0.62422      0.1767        3.532     0.002    0.640      0.6673       0.6242
LINC        0.20643      0.1360        1.517     0.147    0.337      0.2867       0.2064
CONSTANT    0.27423      0.4841        0.5665    0.578    0.132      0.0000       0.2742
DURBIN-WATSON = 1.9737    VON NEUMANN RATIO = 2.0724    RHO =  0.00984
RESIDUAL SUM =  0.17847E-13  RESIDUAL VARIANCE =  0.22257E-01
SUM OF ABSOLUTE ERRORS=   2.2924
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.8689
R-SQUARE BETWEEN ANTILOGS OBSERVED AND PREDICTED = 0.8118
RUNS TEST:    8 RUNS,   11 POS,    0 ZERO,   10 NEG  NORMAL STATISTIC = -1.5603
DURBIN H STATISTIC (ASYMPTOTIC NORMAL) =  0.76882E-01
|_DIAGNOS / CHOWONE=11
DEPENDENT VARIABLE = LSAV            21 OBSERVATIONS
REGRESSION COEFFICIENTS
  0.624221564206       0.206427954104       0.274228157820
SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
  N1   N2      SSE1      SSE2     CHOW  PVALUE     G-Q  DF1  DF2  PVALUE
  11   10   0.13517   0.15749   1.8444   0.182  0.7510    8    7   0.346
CHOW TEST - F DISTRIBUTION WITH DF1=   3 AND DF2=  15
|_STOP
