This section gives some details on the SHAZAM calculations used for obtaining results from weighted least squares estimation. The calculations are demonstrated with the Griffiths, Hill and Judge data set on household expenditure that was analyzed in the section on testing for heteroskedasticity.
The SHAZAM command file (filename: WLSCALC.SHA) below first estimates the household expenditure function by weighted least squares using the WEIGHT= option on the OLS command. Then the estimation procedure is repeated by first transforming the data and then applying OLS to the transformed data. It can be verified that both estimation approaches give the same results.
SAMPLE 1 40
READ (GHJ.txt) FOOD INCOME
* Construct the weight variable
GENR W=1/INCOME
* Weighted Least Squares
OLS FOOD INCOME / WEIGHT=W RESID=V0 PREDICT=YHAT0
*
* Now use a manual approach to get the WLS estimates
* Transform the data - normalize the weights
?STAT W / MEAN=WMEAN
GENR WFOOD=FOOD*SQRT(W/WMEAN)
GENR WINC=INCOME*SQRT(W/WMEAN)
GENR CONST=SQRT(W/WMEAN)
* Estimate the transformed model by OLS
OLS WFOOD WINC CONST / NOCONSTANT RESID=V1 PREDICT=YHAT1 COEF=BETA
*
* Verify that the two estimation approaches give the same results
* for the weighted residuals and the weighted predicted values.
PRINT V0 V1 FOOD YHAT0 YHAT1
*
* R-square calculation based on the original data
GENR E2=V1*V1
?STAT E2 / SUMS=WSSE
?STAT FOOD / WEIGHT=W CPDEV=WSST MEAN=YMEAN
GEN1 R2 = 1 - WSSE/WSST
PRINT R2
* To compute elasticities at the mean we must also use the original data.
?STAT INCOME / WEIGHT=W MEAN=XMEAN
GEN1 ELAS=BETA(1)*XMEAN/YMEAN
PRINT ELAS
STOP
The OLS command has a number of options for saving results from the model estimation. The following options are used in the above SHAZAM program.
COEF= | Saves the estimated coefficients in the variable specified. |
PREDICT= | Saves the predicted values (fitted values) in the variable specified. |
RESID= | Saves the estimated residuals in the variable specified. |
The ? prefix to a command instructs SHAZAM to suppress output from the command. This is useful when the sole purpose of the command is to obtain results for later calculations.
The weights are normalized to sum to the number of observations by dividing the weight variable by its mean. Transformed variables are then constructed by multiplying each observation of the dependent and explanatory variables by the square root of the normalized weight. OLS is then applied to the transformed model. (Note that the parameter estimates and standard errors are not affected by the normalization.)
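The transformation can be written out as follows. The notation is an assumption introduced here for illustration: y_t stands for FOOD, x_t for INCOME, w_t for the weight variable W, and N for the number of observations (40 in this example).

```latex
\begin{align*}
w_t^{*} &= \frac{w_t}{\bar{w}}, \qquad
\bar{w} = \frac{1}{N}\sum_{t=1}^{N} w_t, \qquad
\sum_{t=1}^{N} w_t^{*} = N \\
y_t &= \beta_1 x_t + \beta_2 + e_t \\
\sqrt{w_t^{*}}\, y_t &= \beta_1 \sqrt{w_t^{*}}\, x_t
  + \beta_2 \sqrt{w_t^{*}} + \sqrt{w_t^{*}}\, e_t
\end{align*}
```

The three transformed terms correspond to the variables WFOOD, WINC and CONST in the program. If the variance of e_t is proportional to 1/w_t, the transformed error has constant variance.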
A feature to note is that the original constant term is now transformed to be a variable. The above program generates this in the variable with the name CONST. This variable is included in the list of explanatory variables for the OLS command and the option NOCONSTANT is specified to ensure that a constant will not be included in the model estimation.
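The equivalence of the two approaches can also be checked outside SHAZAM. The following is a minimal Python sketch, not SHAZAM's own implementation. The data values are made up for illustration (they are not the GHJ data), and the function names wls_direct and ols_no_constant are hypothetical helpers.

```python
import math

# Made-up illustrative numbers; these are NOT the GHJ household data.
income = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
food = [9.0, 15.0, 14.0, 24.0, 20.0, 33.0]
n = len(food)

# Weight variable (as in GENR W=1/INCOME), then normalized by its mean.
w = [1.0 / x for x in income]
wmean = sum(w) / n
wn = [wi / wmean for wi in w]  # normalized weights sum to n


def wls_direct(y, x, wt):
    """Return (slope, intercept) minimizing sum(wt * (y - a - b*x)**2)."""
    sw = sum(wt)
    xbar = sum(wi * xi for wi, xi in zip(wt, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(wt, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(wt, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(wt, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


def ols_no_constant(y, x1, x2):
    """OLS of y on x1 and x2 without an intercept (2x2 normal equations)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Manual approach: scale every variable, including the constant.
root = [math.sqrt(v) for v in wn]
wfood = [y * r for y, r in zip(food, root)]
winc = [x * r for x, r in zip(income, root)]
const = root  # the transformed constant term is now a variable

slope_direct, const_direct = wls_direct(food, income, wn)
slope_manual, const_manual = ols_no_constant(wfood, winc, const)
slope_raw, const_raw = wls_direct(food, income, w)  # weights not normalized

print(slope_direct, slope_manual, slope_raw)
print(const_direct, const_manual, const_raw)
```

The three slope values agree, as do the three constants, which illustrates both the equivalence of the two approaches and the fact that normalizing the weights leaves the estimates unchanged.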
When the WEIGHT= option is used on the OLS command, the residuals saved with the RESID= option are the transformed (i.e. weighted) residuals. These residuals are assumed to be homoskedastic, so they are the residuals that should be used in testing procedures.
Note that the untransformed residuals and predicted values can be calculated by applying the weighted least squares estimates to the original data. This calculation is performed by specifying the UT option on the OLS / WEIGHT= command.
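The relationship between the two kinds of residuals can be sketched in Python. This is an illustration under stated assumptions, not SHAZAM code: the data are made up (not the GHJ data) and the helper wls_fit is hypothetical.

```python
import math

# Made-up illustrative numbers; these are NOT the GHJ household data.
x = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
y = [9.0, 15.0, 14.0, 24.0, 20.0, 33.0]
n = len(y)

w = [1.0 / xi for xi in x]       # weights, as in GENR W=1/INCOME
wmean = sum(w) / n
wn = [wi / wmean for wi in w]    # normalized weights


def wls_fit(y, x, wt):
    """Return (slope, intercept) of the weighted least squares line."""
    sw = sum(wt)
    xbar = sum(wi * xi for wi, xi in zip(wt, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(wt, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(wt, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(wt, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


slope, intercept = wls_fit(y, x, wn)

# Untransformed: WLS estimates applied to the original data.
untransformed = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
# Transformed: each residual scaled by the square root of its weight.
transformed = [math.sqrt(wi) * e for wi, e in zip(wn, untransformed)]

# The WLS normal equations make the weighted residual sum zero;
# the plain sum of untransformed residuals is generally not zero.
weighted_sum = sum(wi * e for wi, e in zip(wn, untransformed))
plain_sum = sum(untransformed)
wsse = sum(e * e for e in transformed)

print(weighted_sum, plain_sum, wsse)
```

The sum of squared transformed residuals computed here plays the role of WSSE in the program.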
It should be recognized that the R-square reported from OLS estimation of the transformed model does not provide a useful measure of goodness of fit for the original model (the dependent variables are measured differently). One approach is to calculate the untransformed residuals by applying the WLS estimates to the original data and then compute the measure:
1 - SSE / SST
This R-square is not guaranteed to be in the interval [0, 1].
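A small constructed case shows how this measure can leave the unit interval. The Python sketch below uses made-up data and deliberately extreme made-up weights (not W=1/INCOME and not the GHJ data), so that the weighted fit is dominated by two observations; the helper wls_fit is hypothetical.

```python
# Constructed example: two heavily weighted points pull the WLS line
# away from the trend in the remaining observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 3.0, 4.0, 5.0, 6.0]
w = [100.0, 100.0, 1.0, 1.0, 1.0, 1.0]  # hypothetical weights
n = len(y)


def wls_fit(y, x, wt):
    """Return (slope, intercept) of the weighted least squares line."""
    sw = sum(wt)
    xbar = sum(wi * xi for wi, xi in zip(wt, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(wt, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(wt, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(wt, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


slope, intercept = wls_fit(y, x, w)

# Unweighted sums of squares from the untransformed residuals.
ymean = sum(y) / n
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - ymean) ** 2 for yi in y)
r2_untransformed = 1 - sse / sst

print(slope, intercept, r2_untransformed)
```

Here SSE exceeds SST, so the measure is negative.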
The approach that SHAZAM uses is to obtain an analysis of variance decomposition based on weighted descriptive statistics such that:
WSST = WSSR + WSSE
where WSSE is the sum of squared transformed residuals and WSST is the weighted sum of squared deviations from the weighted mean of the dependent variable in original units.
The analysis of variance table is printed when the ANOVA option is specified on the OLS / WEIGHT= command.
The R-square that SHAZAM reports with weighted least squares estimation is computed as:
1 - WSSE / WSST
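The weighted decomposition can be verified numerically. The Python sketch below is an illustration with made-up data (not the GHJ data) and a hypothetical helper wls_fit. It assumes that the same normalized weights enter both WSSE and WSST; since the ratio WSSE/WSST does not depend on the scale of the weights, the resulting R-square is unaffected by that choice.

```python
# Made-up illustrative numbers; these are NOT the GHJ household data.
x = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
y = [9.0, 15.0, 14.0, 24.0, 20.0, 33.0]
n = len(y)

w = [1.0 / xi for xi in x]
wmean = sum(w) / n
wn = [wi / wmean for wi in w]  # normalized weights


def wls_fit(y, x, wt):
    """Return (slope, intercept, weighted mean of y)."""
    sw = sum(wt)
    xbar = sum(wi * xi for wi, xi in zip(wt, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(wt, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(wt, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(wt, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar, ybar


slope, intercept, ybar_w = wls_fit(y, x, wn)
fitted = [intercept + slope * xi for xi in x]

# Weighted sums of squares, all measured in original units.
wsse = sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(wn, y, fitted))
wssr = sum(wi * (fi - ybar_w) ** 2 for wi, fi in zip(wn, fitted))
wsst = sum(wi * (yi - ybar_w) ** 2 for wi, yi in zip(wn, y))

r2_weighted = 1 - wsse / wsst
print(wsst, wssr + wsse, r2_weighted)
```

Because WSST = WSSR + WSSE holds exactly, this R-square always lies between 0 and 1.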
Now consider the computation of elasticities at the mean. Once again,
it is appropriate to use the original data and not the transformed data.
SHAZAM evaluates the elasticities using the weighted means of the
explanatory variables and the dependent variable.
For the example here, the estimated coefficients from the weighted least squares procedure are saved in the variable with the name BETA. The first element BETA(1) refers to the parameter estimate on INCOME.
The STAT command is used with the WEIGHT= option to save the weighted means of FOOD and INCOME in the scalar variables YMEAN and XMEAN respectively.
The elasticity evaluated at the weighted means is then computed as:
BETA(1) * XMEAN / YMEAN
The SHAZAM output below shows that this computation gives the identical result to the one automatically reported on the estimation output from the OLS / WEIGHT= procedure.
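The elasticity calculation can be sketched in Python as follows. The data are made up for illustration (not the GHJ data), and the variable names are hypothetical stand-ins for XMEAN, YMEAN and BETA(1).

```python
# Made-up illustrative numbers; these are NOT the GHJ household data.
income = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
food = [9.0, 15.0, 14.0, 24.0, 20.0, 33.0]

w = [1.0 / x for x in income]  # weights, as in GENR W=1/INCOME
sw = sum(w)

# Weighted means of the original (untransformed) data.
xmean = sum(wi * xi for wi, xi in zip(w, income)) / sw
ymean = sum(wi * yi for wi, yi in zip(w, food)) / sw

# Closed-form WLS estimates for one regressor plus a constant.
sxy = sum(wi * (xi - xmean) * (yi - ymean)
          for wi, xi, yi in zip(w, income, food))
sxx = sum(wi * (xi - xmean) ** 2 for wi, xi in zip(w, income))
beta_income = sxy / sxx
beta_const = ymean - beta_income * xmean

# Elasticities evaluated at the weighted means.
elas_income = beta_income * xmean / ymean
elas_const = beta_const / ymean

print(elas_income, elas_const, elas_income + elas_const)
```

The WLS line passes through the point of weighted means, so the elasticities for the slope and the constant sum to one.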
The SHAZAM output for this example follows. The estimation results match those reported in Griffiths, Hill and Judge [1993, Equation 15.1.21, p.489].
|_SAMPLE 1 40
|_READ (GHJ.txt) FOOD INCOME
UNIT 88 IS NOW ASSIGNED TO: GHJ.txt
   2 VARIABLES AND 40 OBSERVATIONS STARTING AT OBS 1
|_* Construct the weight variable
|_GENR W=1/INCOME
|_* Weighted Least Squares
|_OLS FOOD INCOME / WEIGHT=W RESID=V0 PREDICT=YHAT0

 OLS ESTIMATION
   40 OBSERVATIONS   DEPENDENT VARIABLE = FOOD
...NOTE..SAMPLE RANGE SET TO: 1, 40
 SUM OF LOG(SQRT(ABS(WEIGHT))) = -1.0012

 R-SQUARE = .4177   R-SQUARE ADJUSTED = .4024
VARIANCE OF THE ESTIMATE-SIGMA**2 = 37.695
STANDARD ERROR OF THE ESTIMATE-SIGMA = 6.1396
SUM OF SQUARED ERRORS-SSE= 1432.4
MEAN OF DEPENDENT VARIABLE = 22.012
LOG OF THE LIKELIHOOD FUNCTION = -129.323

VARIABLE   ESTIMATED   STANDARD    T-RATIO           PARTIAL  STANDARDIZED  ELASTICITY
  NAME    COEFFICIENT    ERROR      38 DF   P-VALUE   CORR.   COEFFICIENT    AT MEANS
INCOME      .25519     .4888E-01    5.221     .000     .646      .6463        .7373
CONSTANT    5.7821     3.257        1.776     .084     .277      .0000        .2627
|_*
|_* Now use a manual approach to get the WLS estimates
|_* Transform the data - normalize the weights
|_?STAT W / MEAN=WMEAN
|_GENR WFOOD=FOOD*SQRT(W/WMEAN)
|_GENR WINC=INCOME*SQRT(W/WMEAN)
|_GENR CONST=SQRT(W/WMEAN)
|_* Estimate the transformed model by OLS
|_OLS WFOOD WINC CONST / NOCONSTANT RESID=V1 PREDICT=YHAT1 COEF=BETA

 OLS ESTIMATION
   40 OBSERVATIONS   DEPENDENT VARIABLE = WFOOD
...NOTE..SAMPLE RANGE SET TO: 1, 40

 R-SQUARE = .0388   R-SQUARE ADJUSTED = .0135
VARIANCE OF THE ESTIMATE-SIGMA**2 = 37.695
STANDARD ERROR OF THE ESTIMATE-SIGMA = 6.1396
SUM OF SQUARED ERRORS-SSE= 1432.4
MEAN OF DEPENDENT VARIABLE = 22.556
LOG OF THE LIKELIHOOD FUNCTION = -128.322
RAW MOMENT R-SQUARE = .9344

VARIABLE   ESTIMATED   STANDARD    T-RATIO           PARTIAL  STANDARDIZED  ELASTICITY
  NAME    COEFFICIENT    ERROR      38 DF   P-VALUE   CORR.   COEFFICIENT    AT MEANS
WINC        .25519     .4888E-01    5.221     .000     .646      .3991        .7460
CONST       5.7821     3.257        1.776     .084     .277      .1519        .2530
|_*
|_* Verify that the two estimation approaches give the same results
|_* for the weighted residuals and the weighted predicted values.
|_PRINT V0 V1 FOOD YHAT0 YHAT1
      V0              V1              FOOD            YHAT0           YHAT1
  -4.571970       -4.571970        9.460000        19.41593        19.41593
  -5.415589       -5.415589        10.56000        19.79279        19.79279
  -2.223623       -2.223623        14.81000        20.34042        20.34042
   4.662707        4.662707        21.71000        20.65882        20.65882
   5.376165        5.376165        22.79000        20.77774        20.77774
  -.4317459E-01   -.4317459E-01    18.19000        20.81512        20.81512
   4.015121        4.015121        22.00000        20.88399        20.88399
  -1.014446       -1.014446        18.12000        21.06508        21.06508
   3.768728        3.768728        23.13000        21.25642        21.25642
  -.8445697       -.8445697        19.00000        21.29992        21.29992
  -.7750214       -.7750214        19.46000        21.42850        21.42850
  -3.082849       -3.082849        17.83000        21.62127        21.62127
   12.38121        12.38121        32.81000        21.64575        21.64575
   .8699330        .8699330        22.13000        21.77654        21.77654
   2.122321        2.122321        23.46000        21.80848        21.80848
  -5.094688       -5.094688        16.81000        21.97086        21.97086
  -1.241711       -1.241711        21.35000        22.20592        22.20592
  -7.689270       -7.689270        14.87000        22.24211        22.24211
   8.787936        8.787936        33.00000        22.57284        22.57284
   1.350760        1.350760        25.19000        22.57777        22.57777
  -5.997791       -5.997791        17.77000        22.70109        22.70109
  -1.612695       -1.612695        22.44000        22.70274        22.70274
  -1.261713       -1.261713        22.87000        22.72164        22.72164
   2.163242        2.163242        26.52000        22.72164        22.72164
  -3.278811       -3.278811        21.00000        22.82103        22.82103
   11.83694        11.83694        37.52000        22.88751        22.88751
  -2.926177       -2.926177        21.69000        22.93017        22.93017
   1.952555        1.952555        27.40000        23.05803        23.05803
   3.749690        3.749690        30.69000        23.44109        23.44109
  -6.266927       -6.266927        19.56000        23.50864        23.50864
   3.273085        3.273085        30.58000        23.56636        23.56636
   12.29417        12.29417        41.12000        23.62889        23.62889
  -10.20401       -10.20401        15.38000        23.63457        23.63457
  -9.439260       -9.439260        17.87000        24.31232        24.31232
  -3.055578       -3.055578        25.54000        24.31232        24.31232
   7.853787        7.853787        39.00000        24.40421        24.40421
  -7.871177       -7.871177        20.44000        24.58022        24.58022
  -1.234324       -1.234324        30.10000        25.07225        25.07225
  -10.45614       -10.45614        20.90000        26.05767        26.05767
   9.992188        9.992188        48.71000        26.15905        26.15905
|_*
|_* R-square calculation based on the original data
|_GENR E2=V1*V1
|_?STAT E2 / SUMS=WSSE
|_?STAT FOOD / WEIGHT=W CPDEV=WSST MEAN=YMEAN
|_GEN1 R2 = 1 - WSSE/WSST
|_PRINT R2
    R2
   .4177030
|_* To compute elasticities at the mean we must also use the original data.
|_?STAT INCOME / WEIGHT=W MEAN=XMEAN
|_GEN1 ELAS=BETA(1)*XMEAN/YMEAN
|_PRINT ELAS
    ELAS
   .7373181
|_STOP
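Several of the reported figures can be cross-checked by simple arithmetic. The Python sketch below uses only numbers copied from the output above. It assumes the usual degrees-of-freedom formulas for the variance estimate and the adjusted R-square, and it assumes that the reported mean of the dependent variable (22.012) is the weighted mean of FOOD. Agreement is only up to the rounding of the printed values.

```python
import math

# Figures copied from the OLS / WEIGHT= output above.
n_obs, k = 40, 2
sse = 1432.4
r2 = 0.4177030
coef_income, se_income = 0.25519, 0.4888e-01
coef_const = 5.7821
ymean = 22.012              # assumed to be the weighted mean of FOOD
elas_income = 0.7373181
loglik_transformed = -128.322
sum_log_sqrt_weight = -1.0012

# Variance estimate and standard error: SSE over degrees of freedom.
sigma2 = sse / (n_obs - k)
sigma = math.sqrt(sigma2)

# Adjusted R-square from the weighted R-square.
r2_adj = 1 - (1 - r2) * (n_obs - 1) / (n_obs - k)

# t-ratio on INCOME and the elasticity of the constant term.
t_income = coef_income / se_income
elas_const = coef_const / ymean

# In this output the two reported log-likelihoods differ by the
# reported SUM OF LOG(SQRT(ABS(WEIGHT))) term.
loglik_weighted = loglik_transformed + sum_log_sqrt_weight

print(sigma2, sigma, r2_adj, t_income, elas_const, loglik_weighted)
```

The computed values round to the reported 37.695, 6.1396, .4024, 5.221, .2627 and -129.323, and the two elasticities sum to one up to rounding.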