## Hypothesis Testing

The standard OLS estimation output from SHAZAM reports, for each coefficient, a t-ratio for testing the null hypothesis that the true regression coefficient is zero. When the regression equation contains more than one explanatory variable, it may be of interest to test the null hypothesis that all slope coefficients are jointly equal to zero. This is called a test of the overall significance of the regression. The F-test statistic for this test is computed with the `ANOVA` option on the `OLS` command.

In practice, the economist is likely to be interested in other types of hypotheses that may involve linear (or nonlinear) combinations of the regression coefficients.

#### Testing a single linear combination of coefficients

Test statistics are computed with the `TEST` command that immediately follows the estimation command. With OLS estimation, the general format of commands for testing a single hypothesis is:

```
OLS depvar indeps / options
TEST equation
```

The `equation` is specified as a function of the variables in the `indeps` list on the `OLS` command. Note: The variable names actually represent the coefficients involved in the hypothesis test. If a hypothesis test involving the intercept coefficient is required then the name `CONSTANT` can be used to represent the intercept.

The SHAZAM output reports a t-test statistic and a p-value for a two-sided test. The null hypothesis is rejected if the p-value is less than the selected level of significance (say, 0.05).
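For reference, the t-statistic behind a single linear restriction can be written in standard OLS notation. The symbols are introduced here only for illustration and are not SHAZAM names: $\mathbf{c}$ holds the weights of the linear combination, $r$ is the hypothesized value, $\hat{\boldsymbol\beta}$ is the OLS coefficient vector, $\hat\sigma^2$ is the estimated error variance, $X$ is the regressor matrix (including the constant), and $n - K$ is the residual degrees of freedom.

$$
t = \frac{\mathbf{c}'\hat{\boldsymbol\beta} - r}{\sqrt{\hat\sigma^2\,\mathbf{c}'(X'X)^{-1}\mathbf{c}}} \sim t_{n-K} \quad \text{under } H_0:\ \mathbf{c}'\boldsymbol\beta = r
$$

In the example output later in this section, the numerator corresponds to `TEST VALUE` and the denominator to `STD. ERROR OF TEST VALUE`.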

One-tailed tests can also be considered. For example, consider testing hypotheses about some unknown parameter $\beta$. Suppose the null and alternative hypotheses are:

$H_0: \beta \le c$ and $H_1: \beta > c$

where $c$ is some scalar constant.

The test statistic for the one-tailed test is computed in the same way as for a two-tailed test. However, the null hypothesis will be rejected only if the value of the test statistic is excessively large (giving support to the alternative hypothesis). Suppose that p is the p-value reported for the two-tailed test. The p-value for the inequality hypotheses stated above can be computed as follows:

- If the test statistic is positive, the p-value is `p/2`.
- If the test statistic is negative, the p-value is `1 - p/2`.
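The rule above can be sketched as a small Python helper. This is only an illustration of the arithmetic, not part of SHAZAM; the function name `one_sided_p_value` and the sample inputs are hypothetical.

```python
def one_sided_p_value(test_statistic: float, two_sided_p: float) -> float:
    """p-value for the null 'parameter <= c' against the alternative 'parameter > c'.

    two_sided_p is the p-value reported for the two-tailed test.
    """
    if test_statistic >= 0:
        # The statistic points towards the alternative: keep one tail only.
        return two_sided_p / 2
    # The statistic points away from the alternative.
    return 1 - two_sided_p / 2


# Hypothetical inputs, chosen only to show both branches.
print(one_sided_p_value(2.0, 0.06))
print(one_sided_p_value(-2.0, 0.06))
```

A statistic of exactly zero gives `p = 1` in the two-tailed test, so both branches return 0.5 and the choice of branch does not matter.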

#### Testing more than one linear combination of coefficients

A test statistic for a joint test that involves two or more functions of the coefficients can be obtained in SHAZAM with the general command format:

```
OLS depvar indeps / options
TEST
  TEST equation1
  TEST equation2
  . . .
END
```

The tests involved in the hypothesis are enclosed between a header that is a blank `TEST` command and an `END` command.

Hypothesis testing typically assumes that the regression errors are normally distributed. This assumption is then used to determine the distribution of the test statistic.
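As a point of reference, the textbook F-statistic for $q$ joint linear restrictions $R\boldsymbol\beta = \mathbf{r}$ is shown below. The notation is an assumption made for illustration (it does not appear in the SHAZAM output): each row of $R$ encodes one `TEST` equation, $\hat{\boldsymbol\beta}$ is the OLS coefficient vector, $\hat\sigma^2$ is the estimated error variance, and $n - K$ is the residual degrees of freedom.

$$
F = \frac{(R\hat{\boldsymbol\beta} - \mathbf{r})'\,\bigl[\hat\sigma^2\,R\,(X'X)^{-1}R'\bigr]^{-1}(R\hat{\boldsymbol\beta} - \mathbf{r})}{q} \sim F_{q,\;n-K} \quad \text{under } H_0
$$

The $F_{q,\,n-K}$ distribution holds exactly under normally distributed errors; without that assumption it is only an approximation.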

#### Example

This example uses the Theil textile data set to illustrate hypothesis testing in SHAZAM. The textile demand equation is specified in log-log form so that the parameter estimates have interpretations as income elasticities and price elasticities. A number of hypotheses about consumer behaviour can be tested. For example, a negative price elasticity is expected. A price elasticity that is less than 1 in absolute value implies that demand is price inelastic.

The command file (filename: `TEST.SHA`) below transforms the data to logarithms and estimates the demand equation by OLS. A series of hypothesis tests is then considered.

```
SAMPLE 1 17
READ (THEIL.txt) YEAR CONSUME INCOME PRICE
* Transform the data to logarithms
GENR LC=LOG(CONSUME)
GENR LY=LOG(INCOME)
GENR LP=LOG(PRICE)
* Estimate the log-log model
OLS LC LY LP / LOGLOG ANOVA
* Hypothesis testing
TEST LY=1
TEST LP=-1
*
* A joint test
TEST
  TEST LY=1
  TEST LP=-1
END
*
* Now duplicate the F-test that is reported with the ANOVA option
TEST
  TEST LY=0
  TEST LP=0
END
STOP
```

Note that the indentation used for the `TEST` commands is optional and is intended to improve the readability of the command file. Tab characters should not be used for indentation; use spaces instead.

The full SHAZAM output is listed at the end of this section. The `ANOVA` option on the `OLS` command produces the following output.

```
                      ANALYSIS OF VARIANCE - FROM MEAN
                      SS         DF             MS                 F
REGRESSION        .51733          2.        .25867               266.018
ERROR             .13613E-01     14.        .97236E-03           P-VALUE
TOTAL             .53094         16.        .33184E-01              .000
```

A test of the null hypothesis that all slope coefficients are zero reports an F-test statistic of `266`. The p-value is reported as `.000` (this actually means less than `.0005`), so there is strong evidence to reject the null hypothesis and conclude that the estimated relationship is a significant one. Note that a critical value for the test is obtained from an F-distribution with (2,14) degrees of freedom.
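The F-statistic can be reproduced by hand from the sums of squares and degrees of freedom in the ANOVA table. The Python sketch below does this. The helper `f_upper_tail_2df` is a hypothetical name, and it relies on a closed form for the upper-tail probability that holds only when the numerator has exactly 2 degrees of freedom, as it does here. It is not how SHAZAM computes the p-value.

```python
# Values copied from the ANALYSIS OF VARIANCE - FROM MEAN table.
ss_regression, df_regression = 0.51733, 2
ss_error, df_error = 0.13613e-01, 14

# F = (regression mean square) / (error mean square)
f_statistic = (ss_regression / df_regression) / (ss_error / df_error)


def f_upper_tail_2df(f: float, df_denominator: int) -> float:
    """P(F > f) for an F(2, df_denominator) variable.

    Closed form valid only for 2 numerator degrees of freedom.
    """
    return (1 + 2 * f / df_denominator) ** (-df_denominator / 2)


p_value = f_upper_tail_2df(f_statistic, df_error)
print(f"F = {f_statistic:.3f}, p-value = {p_value:.2e}")
```

The computed p-value is far below `.0005`, which is why the output rounds it to `.000`.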

Possibly more interesting tests about consumer behaviour are given with the `TEST` commands that follow the model estimation. The model estimation reports the following:

```
VARIABLE   ESTIMATED  STANDARD   T-RATIO        PARTIAL STANDARDIZED ELASTICITY
  NAME    COEFFICIENT   ERROR      14 DF   P-VALUE CORR. COEFFICIENT  AT MEANS
LY         1.1432      .1560       7.328      .000  .891      .3216     1.1432
LP        -.82884      .3611E-01  -22.95      .000 -.987    -1.0074     -.8288
CONSTANT   3.1636      .7048       4.489      .001  .768      .0000     3.1636
```

The income elasticity is the estimated coefficient on `LY` and this is reported as `1.1432`. The next output shows the computation of a test statistic for the null hypothesis that the income elasticity is equal to one.

```
|_TEST LY=1
TEST VALUE =   .14316     STD. ERROR OF TEST VALUE   .15600
T STATISTIC =   .91766674     WITH   14 D.F.    P-VALUE=  .37433
```

The `TEST VALUE` reported in the above output is obtained as `1.1432 - 1 = .1432` (some rounding is introduced in discussing the output). Note that the standard error of this test value is identical to the standard error for the coefficient on `LY` that is listed on the OLS estimation output. The t-statistic is computed as `.1432 / .15600 = 0.918`.

For a test of the null hypothesis against the two-sided alternative that the income elasticity is not equal to 1, the computed p-value is `.37`, so there is no evidence to reject the null hypothesis. For a one-sided test of the null hypothesis that the income elasticity is less than or equal to 1 against the alternative that the income elasticity is greater than 1, the p-value is `0.37433/2 = 0.187`. Again, the null hypothesis is not rejected.
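The arithmetic above can be checked independently of SHAZAM. The Python sketch below recomputes the t-statistic from the rounded coefficient and standard error, then approximates the two-sided p-value by integrating the Student t density with Simpson's rule. The numerical integration is a stand-in chosen to keep the example dependency-free, not SHAZAM's method, and the function names are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 2 * (0.5 - central_half)


# Rounded values from the OLS output for LY; null hypothesis: elasticity = 1.
estimate, std_error, hypothesized, df = 1.1432, 0.15600, 1.0, 14

t_stat = (estimate - hypothesized) / std_error
p_two = two_sided_p(t_stat, df)
p_one = p_two / 2 if t_stat > 0 else 1 - p_two / 2  # H1: elasticity > 1

print(f"t = {t_stat:.3f}, two-sided p = {p_two:.3f}, one-sided p = {p_one:.3f}")
```

Because the inputs are rounded, the results agree with the SHAZAM figures only to about three decimal places.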

The next output shows the computation of a test statistic for the null hypothesis that the price elasticity is equal to `-1`.

```
|_TEST LP=-1
TEST VALUE =   .17116     STD. ERROR OF TEST VALUE   .36111E-01
T STATISTIC =   4.7398530     WITH   14 D.F.    P-VALUE=  .00032
```

The price elasticity is `-.82884` and the `TEST VALUE` on the above output is computed as `-.82884 - (-1) = .17116`. The t-statistic is computed by dividing the test value by the standard error. The associated p-value gives strong evidence to reject the null hypothesis.

Individual tests on the income and price elasticities have been considered. Now consider a joint test of the null hypothesis that the income elasticity is `1` and the price elasticity is `-1`. The output below shows the computed F-statistic for this test.

```
|_TEST
|_  TEST LY=1
|_  TEST LP=-1
|_END
F STATISTIC =   13.275308     WITH    2 AND   14 D.F.  P-VALUE=  .00058
```

By consulting printed statistical tables, the 1% critical value from the F-distribution with (2,14) degrees of freedom is `6.51`. The test statistic clearly exceeds this. So the null hypothesis is rejected. The p-value reported on the SHAZAM output gives this conclusion immediately.
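The tabulated critical value can also be checked numerically. The sketch below is a Python illustration that works only because the test has exactly 2 numerator degrees of freedom, in which case the F upper-tail probability has a closed form that can be inverted directly. The function name `f_critical_2df` is hypothetical.

```python
def f_critical_2df(alpha: float, df_denominator: int) -> float:
    """Upper-alpha critical value of an F(2, df_denominator) distribution.

    Inverts P(F > f) = (1 + 2 f / d) ** (-d / 2), which is valid only
    for 2 numerator degrees of freedom.
    """
    d = df_denominator
    return (d / 2) * (alpha ** (-2 / d) - 1)


critical_1pct = f_critical_2df(0.01, 14)
f_statistic = 13.275308  # joint test of LY=1 and LP=-1, from the output above

print(f"1% critical value = {critical_1pct:.2f}")
print("reject H0" if f_statistic > critical_1pct else "do not reject H0")
```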

### SHAZAM output

```
|_SAMPLE 1 17
|_READ (THEIL.txt) YEAR CONSUME INCOME PRICE

UNIT 88 IS NOW ASSIGNED TO: THEIL.txt
4 VARIABLES AND       17 OBSERVATIONS STARTING AT OBS       1

|_* Transform the data to logarithms
|_GENR LC=LOG(CONSUME)
|_GENR LY=LOG(INCOME)
|_GENR LP=LOG(PRICE)

|_* Estimate the log-log model
|_OLS LC LY LP / LOGLOG ANOVA

OLS ESTIMATION
17 OBSERVATIONS     DEPENDENT VARIABLE = LC
...NOTE..SAMPLE RANGE SET TO:    1,   17

R-SQUARE =    .9744     R-SQUARE ADJUSTED =    .9707
VARIANCE OF THE ESTIMATE-SIGMA**2 =   .97236E-03
STANDARD ERROR OF THE ESTIMATE-SIGMA =   .31183E-01
SUM OF SQUARED ERRORS-SSE=   .13613E-01
MEAN OF DEPENDENT VARIABLE =   4.8864
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -46.5862

MODEL SELECTION TESTS - SEE JUDGE ET AL. (1985,P.242)
AKAIKE (1969) FINAL PREDICTION ERROR - FPE =      .11440E-02
(FPE IS ALSO KNOWN AS AMEMIYA PREDICTION CRITERION - PC)
AKAIKE (1973) INFORMATION CRITERION - LOG AIC =  -6.7770
SCHWARZ (1978) CRITERION - LOG SC =              -6.6300
MODEL SELECTION TESTS - SEE RAMANATHAN (1992,P.167)
CRAVEN-WAHBA (1979)
GENERALIZED CROSS VALIDATION - GCV =           .11807E-02
HANNAN AND QUINN (1979) CRITERION =               .11565E-02
RICE (1984) CRITERION =                           .12376E-02
SHIBATA (1981) CRITERION =                        .10834E-02
SCHWARZ (1978) CRITERION - SC =                   .13202E-02
AKAIKE (1974) INFORMATION CRITERION - AIC =       .11397E-02

                      ANALYSIS OF VARIANCE - FROM MEAN
                      SS         DF             MS                 F
REGRESSION        .51733          2.        .25867               266.018
ERROR             .13613E-01     14.        .97236E-03           P-VALUE
TOTAL             .53094         16.        .33184E-01              .000

                      ANALYSIS OF VARIANCE - FROM ZERO
                      SS         DF             MS                 F
REGRESSION        406.42          3.        135.47            139325.591
ERROR             .13613E-01     14.        .97236E-03           P-VALUE
TOTAL             406.44         17.        23.908                  .000

VARIABLE   ESTIMATED  STANDARD   T-RATIO        PARTIAL STANDARDIZED ELASTICITY
NAME    COEFFICIENT   ERROR      14 DF   P-VALUE CORR. COEFFICIENT  AT MEANS
LY         1.1432      .1560       7.328      .000  .891      .3216     1.1432
LP        -.82884      .3611E-01  -22.95      .000 -.987    -1.0074     -.8288
CONSTANT   3.1636      .7048       4.489      .001  .768      .0000     3.1636

|_* Hypothesis testing
|_TEST LY=1
TEST VALUE =   .14316     STD. ERROR OF TEST VALUE   .15600
T STATISTIC =   .91766674     WITH   14 D.F.    P-VALUE=  .37433
F STATISTIC =   .84211225     WITH    1 AND   14 D.F.  P-VALUE=  .37433
WALD CHI-SQUARE STATISTIC =   .84211225     WITH    1 D.F.  P-VALUE=  .35879
UPPER BOUND ON P-VALUE BY CHEBYCHEV INEQUALITY = 1.00000

|_TEST LP=-1
TEST VALUE =   .17116     STD. ERROR OF TEST VALUE   .36111E-01
T STATISTIC =   4.7398530     WITH   14 D.F.    P-VALUE=  .00032
F STATISTIC =   22.466206     WITH    1 AND   14 D.F.  P-VALUE=  .00032
WALD CHI-SQUARE STATISTIC =   22.466206     WITH    1 D.F.  P-VALUE=  .00000
UPPER BOUND ON P-VALUE BY CHEBYCHEV INEQUALITY =  .04451

|_*
|_* A joint test
|_TEST
|_  TEST LY=1
|_  TEST LP=-1
|_END
F STATISTIC =   13.275308     WITH    2 AND   14 D.F.  P-VALUE=  .00058
WALD CHI-SQUARE STATISTIC =   26.550616     WITH    2 D.F.  P-VALUE=  .00000
UPPER BOUND ON P-VALUE BY CHEBYCHEV INEQUALITY =  .07533

|_*
|_* Now duplicate the F-test that is reported with the ANOVA option
|_TEST
|_  TEST LY=0
|_  TEST LP=0
|_END
F STATISTIC =   266.01794     WITH    2 AND   14 D.F.  P-VALUE=  .00000
WALD CHI-SQUARE STATISTIC =   532.03587     WITH    2 D.F.  P-VALUE=  .00000
UPPER BOUND ON P-VALUE BY CHEBYCHEV INEQUALITY =  .00376
|_STOP
```
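Each `TEST` in the full listing prints several statistics that are closely related. The Python sketch below checks three relationships against the printed figures: the F-statistic for a single restriction equals the squared t-statistic, the Wald chi-square statistic equals the F-statistic times the number of restrictions, and the printed Chebychev bound matches `min(1, q/W)`. The first two are standard results. The last is inferred from the numbers in this listing and should be read as an assumption rather than a documented formula. The chi-square tail function handles only 1 or 2 degrees of freedom, where simple closed forms exist.

```python
import math


def chi2_upper_tail(w: float, df: int) -> float:
    """P(chi-square > w); closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(w / 2))
    if df == 2:
        return math.exp(-w / 2)
    raise ValueError("only df = 1 or df = 2 is handled in this sketch")


# Figures copied from the listing: (label, t or None, F, Wald, restrictions q)
tests = [
    ("LY=1", 0.91766674, 0.84211225, 0.84211225, 1),
    ("LP=-1", 4.7398530, 22.466206, 22.466206, 1),
    ("LY=1 and LP=-1", None, 13.275308, 26.550616, 2),
    ("LY=0 and LP=0", None, 266.01794, 532.03587, 2),
]

for label, t, f, wald, q in tests:
    if t is not None:
        print(f"{label}: t^2 = {t * t:.6f}, F = {f:.6f}")
    print(f"{label}: q*F = {q * f:.5f}, Wald = {wald:.5f}")
    print(f"{label}: chi-square p = {chi2_upper_tail(wald, q):.5f}, "
          f"min(1, q/W) = {min(1.0, q / wald):.5f}")
```

The chi-square p-values are somewhat smaller than the F-based ones because the chi-square distribution ignores the uncertainty in the estimated error variance. With only 14 residual degrees of freedom, the F-based p-values are the more cautious choice.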