SHAZAM Testing for Autocorrelation

## Testing for Autocorrelation

The following options on the `OLS` command can be used to obtain test statistics for detecting the presence of autocorrelation in the residuals.

| Option | Description |
| --- | --- |
| `RSTAT` | Lists residual statistics, including the Durbin-Watson statistic. |
| `DWPVALUE` | Computes the p-value for the Durbin-Watson test statistic. |
| `DLAG` | Computes Durbin's h statistic as a test for AR(1) errors when lagged dependent variables are included as regressors. The one-period lagged dependent variable must be listed as the first explanatory variable. |
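As background for the `DLAG` option, the usual textbook form of Durbin's h statistic is h = rho * sqrt(n / (1 - n * v)). Here rho is the estimated first-order autocorrelation of the residuals, n is the number of observations, and v is the estimated variance of the coefficient on the one-period lagged dependent variable. Under the null hypothesis of no autocorrelation, h is asymptotically standard normal. The Python sketch below illustrates that formula only. The function name `durbin_h` and the input values are hypothetical, and it is an assumption that SHAZAM's internal computation matches this form in every detail.

```python
import math


def durbin_h(rho_hat, n_obs, var_lag_coef):
    """Textbook Durbin h statistic (illustrative; not SHAZAM code).

    rho_hat      -- estimated first-order residual autocorrelation
    n_obs        -- number of observations used in the regression
    var_lag_coef -- estimated variance of the coefficient on the
                    one-period lagged dependent variable
    """
    denom = 1.0 - n_obs * var_lag_coef
    if denom <= 0.0:
        # The statistic cannot be computed when n * v >= 1.
        raise ValueError("h is undefined when n_obs * var_lag_coef >= 1")
    return rho_hat * math.sqrt(n_obs / denom)


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
h = durbin_h(rho_hat=0.3, n_obs=50, var_lag_coef=0.01)
print(round(h, 4))
```

The guard on the denominator reflects a known limitation of the test: when n times the coefficient variance reaches 1, the square root is not defined and the statistic cannot be used.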


### Using the Durbin-Watson test

The Durbin-Watson test statistic is designed for detecting errors that follow a first-order autoregressive process. This statistic also fills an important role as a general test of model misspecification. See, for example, the discussion in Gujarati [1995, pp. 462-464].
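For reference, the statistic is conventionally defined from the OLS residuals e(1), ..., e(n) as the sum of squared first differences of the residuals divided by the residual sum of squares. The Python sketch below computes it directly. It is an illustration outside SHAZAM, and the function name `durbin_watson` and the sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson statistic d for a sequence of residuals."""
    e = list(residuals)
    if len(e) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: sum of squared first differences of the residuals.
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    # Denominator: residual sum of squares.
    den = sum(x * x for x in e)
    if den == 0:
        raise ValueError("residuals are all zero")
    return num / den


# Residuals that flip sign every period push d above 2
# (negative autocorrelation); residuals that change sign
# rarely pull d below 2 (positive autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0]
persistent = [1.0, 1.0, -1.0, -1.0]
print(durbin_watson(alternating))
print(durbin_watson(persistent))
```

The statistic lies between 0 and 4, with values near 2 indicating no first-order autocorrelation in the residuals.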

The `DWPVALUE` option on the `OLS` command computes a p-value for the Durbin-Watson test statistic. Suppose the Durbin-Watson test statistic, d, has a calculated value of DW. For a test of the null hypothesis of no autocorrelation in the errors against the alternative hypothesis of positive autocorrelation the p-value is:

p-value = P(d < DW)

The computation of a p-value is useful if the Durbin-Watson test statistic falls in the inconclusive range given in statistical tables. If the p-value is less than a selected level of significance (say 0.05) then there is evidence to reject the null hypothesis.

If the alternative hypothesis of interest is negative autocorrelation then the p-value is:

p-value = P(d > DW) = 1 - P(d < DW)

Following the `OLS / DWPVALUE` command the p-value for the Durbin-Watson test is available in the temporary variable `$CDF`. Therefore, when testing for negative autocorrelation, a p-value can be computed with the commands:

```
OLS . . . / DWPVALUE
GEN1 PVAL=1-$CDF
PRINT PVAL
```

#### Example

This example uses the Theil textile data set. The SHAZAM commands (filename: `DW.SHA`) below first estimate an equation with `PRICE` as the only explanatory variable. But economic theory suggests that `INCOME` is an important variable in a demand equation. A standard statistical result is that omitting a relevant variable from the regression generally biases the OLS estimator, unless the omitted variable is uncorrelated with the included regressors. The second OLS regression is the preferred model specification and includes both `PRICE` and `INCOME` as explanatory variables.

```
SAMPLE 1 17
READ (THEIL.txt) YEAR CONSUME INCOME PRICE
OLS CONSUME PRICE / RSTAT DWPVALUE
* Now include the variable INCOME in the regression equation
OLS CONSUME INCOME PRICE / RSTAT DWPVALUE
* Compute a p-value for testing for negative autocorrelation
GEN1 PVAL=1-$CDF
PRINT PVAL
STOP
```

The full SHAZAM output is listed at the end of this section. The first OLS regression reports the results:

```
DURBIN-WATSON STATISTIC  =   1.19071
DURBIN-WATSON POSITIVE AUTOCORRELATION TEST P-VALUE =    0.018346
NEGATIVE AUTOCORRELATION TEST P-VALUE =    0.981655
```

The estimation uses 17 observations and there are 2 estimated coefficients (including the intercept parameter). If we ignore the p-value and rely on the tables printed at the end of textbooks, the lower and upper critical values are 1.133 and 1.381 (for a 5% significance level) and 0.874 and 1.102 (for a 1% significance level). At the 5% level the reported Durbin-Watson statistic of 1.19071 lies between the lower and upper critical values, so the table-based test is inconclusive. At the 1% level the statistic exceeds the upper critical value of 1.102, so the null hypothesis of no autocorrelation is not rejected. The computed p-value of 0.018346 resolves the 5% case: it is below 0.05 but above 0.01, so there is evidence for positive autocorrelation at the 5% level but not at the 1% level.
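The table-based comparison for a test against positive autocorrelation can be sketched in Python as follows. This is an illustration outside SHAZAM: the function name `dw_bounds_test` is hypothetical, and the critical values are the ones quoted above for 17 observations and one explanatory variable plus an intercept.

```python
def dw_bounds_test(dw, d_lower, d_upper):
    """Classify a Durbin-Watson statistic against tabulated bounds.

    Tests the null of no autocorrelation against the alternative
    of positive autocorrelation.
    """
    if d_lower > d_upper:
        raise ValueError("lower bound exceeds upper bound")
    if dw < d_lower:
        return "reject"
    if dw > d_upper:
        return "do not reject"
    # Between the bounds the tables give no decision.
    return "inconclusive"


dw = 1.19071
print("5% level:", dw_bounds_test(dw, 1.133, 1.381))
print("1% level:", dw_bounds_test(dw, 0.874, 1.102))
```

The middle branch is the case where a computed p-value is most useful, because the tabulated bounds alone do not settle the test.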

When the variable `INCOME` is added to the regression the SHAZAM estimation results report:

```
DURBIN-WATSON STATISTIC  =   2.01855
DURBIN-WATSON POSITIVE AUTOCORRELATION TEST P-VALUE =    0.301270
NEGATIVE AUTOCORRELATION TEST P-VALUE =    0.698730
```

By inspecting the p-value, the conclusion is that when both `PRICE` and `INCOME` are included in the regression there is no evidence to reject the null hypothesis of no autocorrelation in the errors.
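The p-value decision rule applied to both regressions can be sketched in Python as follows. This is an illustration outside SHAZAM: the function names are hypothetical, and the inputs are the values of P(d < DW) reported in the output for the two regressions.

```python
def dw_pvalues(cdf_value):
    """Split P(d < DW) into the two one-sided p-values.

    Returns (p-value against positive autocorrelation,
             p-value against negative autocorrelation).
    """
    if not 0.0 <= cdf_value <= 1.0:
        raise ValueError("cdf_value must be a probability")
    return cdf_value, 1.0 - cdf_value


def rejects_null(p_value, alpha=0.05):
    """True when the p-value is below the chosen significance level."""
    return p_value < alpha


# P(d < DW) as reported for the two regressions in this example.
reported = [("PRICE only", 0.018346), ("INCOME and PRICE", 0.301270)]
for label, cdf in reported:
    p_pos, p_neg = dw_pvalues(cdf)
    print(label,
          "| positive:", rejects_null(p_pos),
          "| negative:", rejects_null(p_neg))
```

Only the regression that omits `INCOME` rejects the null hypothesis at the 5% level, and only against the alternative of positive autocorrelation.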

The regression equation that omitted `INCOME` showed evidence for autocorrelated errors. However, this appears to reflect the omission of an important variable rather than a need to correct for autocorrelation. That is, the omitted variable `INCOME` is highly autocorrelated, and when this variable is included in the regression (as economic theory would typically suggest) the autocorrelation in the residuals disappears.

#### SHAZAM output with Durbin-Watson test statistics

```
|_SAMPLE 1 17
|_READ (THEIL.txt) YEAR CONSUME INCOME PRICE
UNIT 88 IS NOW ASSIGNED TO: THEIL.txt
4 VARIABLES AND       17 OBSERVATIONS STARTING AT OBS       1

|_OLS CONSUME PRICE / RSTAT DWPVALUE

OLS ESTIMATION
17 OBSERVATIONS     DEPENDENT VARIABLE = CONSUME
...NOTE..SAMPLE RANGE SET TO:      1,     17

DURBIN-WATSON STATISTIC  =   1.19071
DURBIN-WATSON POSITIVE AUTOCORRELATION TEST P-VALUE =    0.018346
NEGATIVE AUTOCORRELATION TEST P-VALUE =    0.981655

R-SQUARE =   0.8961     R-SQUARE ADJUSTED =   0.8892
VARIANCE OF THE ESTIMATE-SIGMA**2 =   61.594
STANDARD ERROR OF THE ESTIMATE-SIGMA =   7.8482
SUM OF SQUARED ERRORS-SSE=   923.91
MEAN OF DEPENDENT VARIABLE =   134.51
LOG OF THE LIKELIHOOD FUNCTION = -58.0829

VARIABLE   ESTIMATED  STANDARD   T-RATIO        PARTIAL STANDARDIZED ELASTICITY
NAME    COEFFICIENT   ERROR      15 DF   P-VALUE CORR. COEFFICIENT  AT MEANS
PRICE     -1.3233     0.1163      -11.38     0.000-0.947    -0.9466    -0.7508
CONSTANT   235.49      9.079       25.94     0.000 0.989     0.0000     1.7508

DURBIN-WATSON = 1.1907    VON NEUMANN RATIO = 1.2651    RHO =  0.38554
RESIDUAL SUM =  0.00000      RESIDUAL VARIANCE =   61.594
SUM OF ABSOLUTE ERRORS=   102.14
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.8961
RUNS TEST:    6 RUNS,    9 POS,    0 ZERO,    8 NEG  NORMAL STATISTIC = -1.7451

|_* Now include the variable INCOME in the regression equation

|_OLS CONSUME INCOME PRICE / RSTAT DWPVALUE

OLS ESTIMATION
17 OBSERVATIONS     DEPENDENT VARIABLE = CONSUME
...NOTE..SAMPLE RANGE SET TO:      1,     17

DURBIN-WATSON STATISTIC  =   2.01855
DURBIN-WATSON POSITIVE AUTOCORRELATION TEST P-VALUE =    0.301270
NEGATIVE AUTOCORRELATION TEST P-VALUE =    0.698730

R-SQUARE =   0.9513     R-SQUARE ADJUSTED =   0.9443
VARIANCE OF THE ESTIMATE-SIGMA**2 =   30.951
STANDARD ERROR OF THE ESTIMATE-SIGMA =   5.5634
SUM OF SQUARED ERRORS-SSE=   433.31
MEAN OF DEPENDENT VARIABLE =   134.51
LOG OF THE LIKELIHOOD FUNCTION = -51.6471

VARIABLE   ESTIMATED  STANDARD   T-RATIO        PARTIAL STANDARDIZED ELASTICITY
NAME    COEFFICIENT   ERROR      14 DF   P-VALUE CORR. COEFFICIENT  AT MEANS
INCOME     1.0617     0.2667       3.981     0.001 0.729     0.2387     0.8129
PRICE     -1.3830     0.8381E-01  -16.50     0.000-0.975    -0.9893    -0.7846
CONSTANT   130.71      27.09       4.824     0.000 0.790     0.0000     0.9718

DURBIN-WATSON = 2.0185    VON NEUMANN RATIO = 2.1447    RHO = -0.18239
RESIDUAL SUM = -0.53291E-14  RESIDUAL VARIANCE =   30.951
SUM OF ABSOLUTE ERRORS=   72.787
R-SQUARE BETWEEN OBSERVED AND PREDICTED = 0.9513
RUNS TEST:    7 RUNS,    9 POS,    0 ZERO,    8 NEG  NORMAL STATISTIC = -1.2423

|_* Compute a p-value for testing for negative autocorrelation
|_GEN1 PVAL=1-$CDF
..NOTE..CURRENT VALUE OF $CDF =  0.30127
|_PRINT PVAL
PVAL
0.6987301
|_STOP
```