Ordinary Least Squares Regression
The general form of the linear regression equation considers
the relationship between a dependent variable and several explanatory
variables.
This is demonstrated with the Theil textile data set.
Consider estimating the relationship between the dependent variable
CONSUME and the explanatory variables INCOME and
PRICE . The linear regression equation is:
$$\text{CONSUME}_t = \beta_0 + \beta_1\,\text{INCOME}_t + \beta_2\,\text{PRICE}_t + \varepsilon_t$$
where $\varepsilon_t$ is a random error term. Ordinary least squares
estimates of the parameters can be obtained with the next SHAZAM commands.
SAMPLE 1 17
READ (THEIL.txt) YEAR CONSUME INCOME PRICE
OLS CONSUME INCOME PRICE
STOP
The OLS command contains a list of variables.
The dependent variable must be listed as the first variable name.
All variable names that follow are the explanatory variables.
SHAZAM automatically includes an intercept in the regression equation.
Note: If there is a compelling reason to exclude the
intercept parameter, this can be done by specifying the
NOCONSTANT option on the OLS command.
This gives a regression through the origin.
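To make the role of the intercept concrete, here is a minimal Python sketch of least squares solved through the normal equations. It is not how SHAZAM is implemented; the function names `solve_linear_system` and `fit_ols` and the toy data are assumptions made for illustration only. The `intercept` flag appends a column of ones as the last regressor, which mirrors SHAZAM listing CONSTANT last, and setting it to `False` plays the role of NOCONSTANT.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations act on both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Bring the largest remaining pivot into place for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(y, regressors, intercept=True):
    """Return OLS coefficients in regressor order, intercept last."""
    columns = [list(col) for col in regressors]
    if intercept:
        columns.append([1.0] * len(y))  # column of ones = CONSTANT
    k = len(columns)
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data (hypothetical, not the Theil data): y = 5 + 2*x1 - 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [5.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]

with_constant = fit_ols(y, [x1, x2])                    # slopes, then intercept
through_origin = fit_ols(y, [x1, x2], intercept=False)  # slopes only
```

With the intercept included the sketch recovers the two slopes and the constant used to build the toy data. Without it, only two slope coefficients are returned and the fitted line is forced through the origin.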
With the OLS command the explanatory variables can be
listed in any order. On the READ command the variables must be
listed in the order that they appear in the data file.
This does not need to be the order that is used on the OLS
command. So the OLS command:
OLS CONSUME INCOME PRICE
is equivalent to the command:
OLS CONSUME PRICE INCOME
The SHAZAM OLS estimation results are below.
|_SAMPLE 1 17
|_READ (THEIL.txt) YEAR CONSUME INCOME PRICE
UNIT 88 IS NOW ASSIGNED TO: THEIL.txt
4 VARIABLES AND 17 OBSERVATIONS STARTING AT OBS 1
|_OLS CONSUME INCOME PRICE
OLS ESTIMATION
17 OBSERVATIONS DEPENDENT VARIABLE = CONSUME
...NOTE..SAMPLE RANGE SET TO: 1, 17
R-SQUARE = .9513 R-SQUARE ADJUSTED = .9443
VARIANCE OF THE ESTIMATE-SIGMA**2 = 30.951
STANDARD ERROR OF THE ESTIMATE-SIGMA = 5.5634
SUM OF SQUARED ERRORS-SSE= 433.31
MEAN OF DEPENDENT VARIABLE = 134.51
LOG OF THE LIKELIHOOD FUNCTION = -51.6471
VARIABLE   ESTIMATED   STANDARD    T-RATIO            PARTIAL  STANDARDIZED  ELASTICITY
  NAME    COEFFICIENT   ERROR       14 DF   P-VALUE    CORR.   COEFFICIENT    AT MEANS
INCOME       1.0617     .2667       3.981     .001     .729       .2387        .8129
PRICE       -1.3830     .8381E-01  -16.50     .000    -.975      -.9893       -.7846
CONSTANT     130.71     27.09       4.824     .000     .790       .0000        .9718
|_STOP
The intercept estimate (assigned the name CONSTANT )
is listed as the final coefficient estimate.
The estimated equation can be written as:
CONSUME = 130.7 + 1.06 INCOME - 1.38 PRICE + ê
where ê is the estimated residual.
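As a small illustration of how the estimated equation is used, the Python sketch below evaluates it for one observation. The coefficients are copied from the SHAZAM output; the function names and the input values (INCOME = 100, PRICE = 100, CONSUME = 100) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Coefficient estimates copied from the SHAZAM OLS output.
COEF = {"CONSTANT": 130.71, "INCOME": 1.0617, "PRICE": -1.3830}


def predict_consume(income, price):
    """Fitted value of CONSUME from the estimated equation."""
    return (COEF["CONSTANT"]
            + COEF["INCOME"] * income
            + COEF["PRICE"] * price)


def residual(consume, income, price):
    """Estimated residual: observed CONSUME minus its fitted value."""
    return consume - predict_consume(income, price)


# Hypothetical observation, for illustration only.
fitted = predict_consume(100.0, 100.0)     # 130.71 + 106.17 - 138.30
e_hat = residual(100.0, 100.0, 100.0)      # observed minus fitted

# Effect of a one-unit rise in PRICE with INCOME held fixed.
price_effect = predict_consume(100.0, 101.0) - fitted
```

Here `fitted` comes to about 98.58, so the residual for the hypothetical observation is about 1.42, and `price_effect` equals the PRICE coefficient, -1.383.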
The OLS estimation output for the model with 2 or more explanatory
variables can be interpreted in a similar way to the estimation results
that are obtained for the model with 1 explanatory variable.
That is, the T-RATIO gives the t-statistic for a test
of the null hypothesis that the coefficient is zero. The
P-VALUE gives the associated p-value for a two-sided
test.
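The T-RATIO column can be cross-checked by hand, since each t-ratio is the coefficient estimate divided by its standard error. The Python sketch below does this with the rounded figures printed in the output, so small differences in the last digit are expected. The variable names are illustrative, and the 5% two-sided critical value of 2.145 for 14 degrees of freedom is taken from standard t tables rather than from the SHAZAM output.

```python
# (coefficient, standard error) pairs copied from the SHAZAM output.
estimates = {
    "INCOME": (1.0617, 0.2667),
    "PRICE": (-1.3830, 0.08381),   # .8381E-01 in the output
    "CONSTANT": (130.71, 27.09),
}

# Two-sided 5% critical value for 14 degrees of freedom (from t tables).
T_CRIT_5PCT_14DF = 2.145

# t-ratio for the null hypothesis that the coefficient is zero.
t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}

# Reject the null at the 5% level when |t| exceeds the critical value.
significant = {name: abs(t) > T_CRIT_5PCT_14DF
               for name, t in t_ratios.items()}

for name, t in t_ratios.items():
    print(f"{name:<9} t = {t:8.3f}  reject H0 at 5%: {significant[name]}")
```

All three t-ratios exceed the critical value in absolute terms, which agrees with the small p-values in the output.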
Note that the R-square estimate is .9513. What does this mean?
It says that 95.13% of the variation in the dependent variable
CONSUME has been explained by the regression equation.
This suggests a very "good fit". However, "high" R-square values
can be typical of models that use time series data.
Economic time series often share a tendency to follow
an upward or downward trend, and a common trend can raise the
R-square even when the underlying relationship is weak.
When working with time series data, it is important
to test for the presence of serial correlation in the residuals.
This is discussed later in this guide.
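The fit statistics in the output are linked by simple formulas, and checking them is a useful way to see what each one measures. The Python sketch below recomputes the adjusted R-square, the variance estimate, and the standard error from the printed R-SQUARE, SSE, and sample size. It assumes the usual degrees-of-freedom correction n - k with k = 3 estimated coefficients, which matches the 14 DF shown in the output. The variable names are illustrative.

```python
import math

# Figures copied from the SHAZAM output.
n_obs = 17          # observations
n_coef = 3          # INCOME, PRICE and CONSTANT
r_square = 0.9513
sse = 433.31        # sum of squared errors

df = n_obs - n_coef                     # 14 degrees of freedom

# Adjusted R-square penalizes the fit for the number of coefficients.
r_square_adj = 1.0 - (1.0 - r_square) * (n_obs - 1) / df

# Variance of the estimate and its square root.
sigma2 = sse / df
sigma = math.sqrt(sigma2)

# Total variation in CONSUME implied by R-square = 1 - SSE/SST.
sst = sse / (1.0 - r_square)

print(f"R-SQUARE ADJUSTED = {r_square_adj:.4f}")
print(f"SIGMA**2          = {sigma2:.3f}")
print(f"SIGMA             = {sigma:.4f}")
```

The recomputed values agree with the reported .9443, 30.951, and 5.5634 up to rounding of the printed inputs.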