A Monte Carlo Study

### A Monte Carlo Experiment to Demonstrate the Properties of the OLS Estimator

For the two-variable linear regression equation, the parameter of interest is usually the slope parameter. Under the assumptions that the equation errors have zero mean, the explanatory variable is non-random and the model is correctly specified, it can be shown that the OLS estimator is an unbiased estimator of the slope parameter. With the additional assumption that the errors are normally distributed, it follows that the OLS estimator has a normal distribution.
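The unbiasedness result can be sketched in a few lines. The notation here is an assumption introduced for illustration and does not appear in the original text: the model is written as $y_i = \alpha + \beta x_i + e_i$ for $i = 1, \dots, N$, and $b$ denotes the OLS slope estimator. The variance expression additionally assumes uncorrelated errors with common variance $\sigma^2$.

$$
b = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}
  = \beta + \sum_{i=1}^{N} w_i e_i ,
\qquad
w_i = \frac{x_i - \bar{x}}{\sum_{j=1}^{N} (x_j - \bar{x})^2}
$$

$$
E[b] = \beta + \sum_{i=1}^{N} w_i \, E[e_i] = \beta ,
\qquad
\operatorname{Var}(b) = \sigma^2 \sum_{i=1}^{N} w_i^2 = \frac{\sigma^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2}
$$

Because the weights $w_i$ depend only on the non-random $x_i$, $b$ is a linear combination of the errors. This is why normally distributed errors imply a normally distributed estimator.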

With a sample of data, the OLS estimation rule can be applied to get an estimate of the slope parameter. In any one sample, the OLS estimate will be smaller or larger than the true parameter. However, if estimates were computed from a "large" number of random samples, then the average of the parameter estimates over all samples would be very close to the true parameter value.

The above ideas can be illustrated with the help of computer simulation. Repeated samples of data can be generated that are consistent with a regression model. The properties of the OLS estimation rule can then be analyzed. This is known as a Monte Carlo study.

#### Example

This example is adapted from the presentation in Section 6.5 of Griffiths, Hill and Judge [1993, pp. 219-223]. A data set on household expenditure for food was used to obtain OLS estimation results for a food expenditure relationship. For the purposes of the experiment, these results are treated not as numerical estimates but as a description of the true model for household expenditure on food. That is, assume that the true linear regression model is:

Y = 7.3832 + 0.2323 INCOME + e

where the error term is normally distributed with mean 0 and variance 46.853.

The computer simulation proceeds as follows. For a sample size of N=40:

1. Use a random number generator to generate a sample of independent and identically distributed errors. The NOR function on the GENR command is used to generate normal random numbers.
2. Calculate sample observations for the variable Y where INCOME is fixed.
3. Apply the OLS estimation rule to obtain OLS estimates of the intercept parameter and the slope parameter.

The above steps are repeated. At each replication a different set of expenditures Y is computed. In this example, the number of replications is set at 1000. The experiment then yields 1000 estimates of the slope parameter using a sample size of N=40. The sampling variability in the estimates can be summarized by plotting the empirical frequency distribution of the estimates.
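The replication scheme described above can be sketched in Python using only the standard library. This is an illustrative translation, not the SHAZAM implementation. The intercept, slope and error variance are the values given in the text, but the `income` values are a made-up placeholder because the GHJ.txt data are not reproduced here. The function names `ols` and `monte_carlo` are likewise hypothetical.

```python
import random
import statistics

# True model from the text: Y = 7.3832 + 0.2323 * INCOME + e, Var(e) = 46.853
ALPHA, BETA, SIGMA2 = 7.3832, 0.2323, 46.853
N, NREP = 40, 1000


def ols(x, y):
    """Return (intercept, slope) from a two-variable OLS fit."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def monte_carlo(income, nrep, rng):
    """Repeat steps 1-3 nrep times with income held fixed; return the slopes."""
    sigma = SIGMA2 ** 0.5
    slopes = []
    for _ in range(nrep):
        # Step 1: independent normal errors with mean 0 and std. dev. sigma
        e = [rng.gauss(0.0, sigma) for _ in income]
        # Step 2: construct Y from the assumed true model
        y = [ALPHA + BETA * xi + ei for xi, ei in zip(income, e)]
        # Step 3: estimate by OLS and keep the slope estimate
        slopes.append(ols(income, y)[1])
    return slopes


# ASSUMPTION: placeholder income values, NOT the GHJ.txt data.
income = [25.0 + 2.0 * i for i in range(N)]
rng = random.Random(12345)  # fixed seed so the run is repeatable

b40 = monte_carlo(income, NREP, rng)
print("mean of slope estimates:", statistics.fmean(b40))
print("std. dev. of slope estimates:", statistics.stdev(b40))
```

With the placeholder data, the mean of the 1000 slope estimates should lie close to 0.2323, while the spread depends on the income values chosen.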

An interesting question is: What happens to the sampling performance of the OLS estimator as the sample size N is increased? One way of investigating this is to include each observation of the variable INCOME twice. This gives a sample size of N=80 while keeping the same set of INCOME values.
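The effect of duplicating INCOME can be previewed with a small Python sketch (standard library only). It is an illustration under stated assumptions, not the SHAZAM code: `income40` holds made-up placeholder values in place of the GHJ.txt data, and `slope` and `slope_estimates` are hypothetical helper names. Since duplicating the observations doubles the sum of squared deviations of INCOME, the standard deviation of the slope estimates is expected to shrink by a factor of about 1/sqrt(2).

```python
import random
import statistics

# True model from the text; the error std. dev. is sqrt(46.853)
ALPHA, BETA, SIGMA = 7.3832, 0.2323, 46.853 ** 0.5


def slope(x, y):
    """OLS slope estimate for the two-variable regression of y on x."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_estimates(income, nrep, rng):
    """Simulate nrep samples with income held fixed; return the slope estimates."""
    out = []
    for _ in range(nrep):
        y = [ALPHA + BETA * xi + rng.gauss(0.0, SIGMA) for xi in income]
        out.append(slope(income, y))
    return out


# ASSUMPTION: placeholder income values, NOT the GHJ.txt data.
income40 = [25.0 + 2.0 * i for i in range(40)]
income80 = income40 + income40  # every observation now appears twice

rng = random.Random(2024)  # fixed seed so the run is repeatable
sd40 = statistics.stdev(slope_estimates(income40, 1000, rng))
sd80 = statistics.stdev(slope_estimates(income80, 1000, rng))
ratio = sd80 / sd40

print("std. dev. with N=40:", sd40)
print("std. dev. with N=80:", sd80)
print("ratio:", ratio, " 1/sqrt(2):", 2 ** -0.5)
```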

The SHAZAM commands (filename: MCARLO.SHA) below perform the Monte Carlo study. An experiment with N=40 is set up first. The 1000 estimates of the slope parameter are saved in the variable B40. This is followed by an experiment with N=80. The 1000 estimates from this experiment are saved in the variable B80. The command file illustrates various SHAZAM features and the interested user should consult the SHAZAM User's Reference Manual for further details on the SHAZAM commands.

    SAMPLE 1 40
    READ (GHJ.txt) FOOD INCOME / CLOSE
    * Run an OLS regression and save the coefficients in the variable BETA.
    OLS FOOD INCOME / COEF=BETA
    * Get the standard error
    GEN1 SIG=SQRT($SIG2)
    * Set the number of replications for the Monte Carlo experiment.
    GEN1 NREP=1000
    SET RANFIX NODOECHO NOOUTPUT
    DIM B40 NREP
    * Use a DO-loop to do repeat operations
    DO #=1,NREP
    * Generate random normal numbers with standard deviation SIG.
      GENR E=NOR(SIG)
    * Generate Y
      GENR Y = BETA:2 + BETA:1 * INCOME + E
    * Run an OLS regression and save the estimated coefficients.
      OLS Y INCOME / COEF=BTEMP
      GEN1 B40:#=BTEMP:1
    ENDO
    DELETE FOOD INCOME E Y
    * Now duplicate the observations and repeat the Monte Carlo experiment.
    SAMPLE 41 80
    READ (GHJ.txt) FOOD INCOME / CLOSE
    SAMPLE 1 40
    READ (GHJ.txt) FOOD INCOME / CLOSE
    SAMPLE 1 80
    DIM B80 NREP
    DO #=1,NREP
      GENR E=NOR(SIG)
      GENR Y = BETA:2 + BETA:1 * INCOME + E
      OLS Y INCOME / COEF=BTEMP
      GEN1 B80:#=BTEMP:1
    ENDO
    SET OUTPUT
    * Analyze the results - a histogram gives a frequency distribution.
    SAMPLE 1 NREP
    STAT B40 B80
    PLOT B40 / HISTO GROUPS=30
    PLOT B80 / HISTO GROUPS=30
    STOP

The SHAZAM output is reproduced in the listing at the end of this section. The histogram presentation gives one method of showing the frequency distribution of the slope estimate. The figure below shows a smoothed version of the histogram plot. This figure was prepared using nonparametric density estimation, which is implemented with the NONPAR command in SHAZAM.
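To give a sense of how a smoothed histogram can be produced, here is a minimal Python sketch of kernel density estimation (standard library only). It is not a description of what the NONPAR command does internally: the Gaussian kernel and the normal reference bandwidth rule are assumptions chosen for illustration, `gaussian_kde` is a hypothetical helper name, and the `estimates` list is simulated stand-in data, not the B40 values from the experiment.

```python
import math
import random
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function f(x) that evaluates a Gaussian kernel density estimate."""
    n = len(sample)
    if bandwidth is None:
        # ASSUMPTION: normal reference rule of thumb, h = 1.06 * s * n^(-1/5)
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-0.2)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each observation
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


# ASSUMPTION: stand-in data drawn from a normal distribution, NOT the B40 values.
rng = random.Random(7)
estimates = [rng.gauss(0.2323, 0.055) for _ in range(1000)]

f = gaussian_kde(estimates)
step = 0.005
grid = [step * i for i in range(101)]  # evaluation points from 0.0 to 0.5
curve = [f(x) for x in grid]
area = sum(curve) * step  # Riemann-sum check that the density integrates to about 1

print("approximate area under the curve:", area)
```

Plotting `curve` against `grid` gives a smooth counterpart to the histogram. A larger bandwidth produces a smoother but flatter curve.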

The above figure shows a comparison of the distribution of the slope estimate for sample sizes of N=40 and N=80. The graph shows that the distribution of the estimates is approximately normal, and that an increase in sample size leads to increased precision of the OLS estimator.
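The gain in precision can be checked against the numbers reported in the SHAZAM output listing. The notation is an assumption introduced for illustration: $b_{40}$ and $b_{80}$ denote the slope estimators for the two sample sizes, $x_i$ denotes INCOME, and $\sigma^2$ is the error variance. Duplicating every observation leaves the mean of INCOME unchanged and doubles the sum of squared deviations, so

$$
\operatorname{sd}(b_{80})
= \sqrt{\frac{\sigma^2}{2 \sum_{i=1}^{40} (x_i - \bar{x})^2}}
= \frac{\operatorname{sd}(b_{40})}{\sqrt{2}} .
$$

In the experiment, the true standard deviation of the slope estimator for N=40 equals the standard error reported by the first OLS regression, because its estimated variance 46.853 is used as the true error variance:

$$
\operatorname{sd}(b_{40}) = 0.05529 ,
\qquad
\operatorname{sd}(b_{80}) = \frac{0.05529}{\sqrt{2}} \approx 0.0391 .
$$

The STAT output reports simulated standard deviations of 0.055417 for B40 and 0.039947 for B80, a ratio of about 0.721 compared with the theoretical $1/\sqrt{2} \approx 0.707$.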


#### SHAZAM output - A Monte Carlo Experiment

|_SAMPLE 1 40
|_READ (GHJ.txt) FOOD INCOME / CLOSE

UNIT 88 IS NOW ASSIGNED TO: GHJ.txt
2 VARIABLES AND       40 OBSERVATIONS STARTING AT OBS       1

|_* Run an OLS regression and save the coefficients in the variable BETA.
|_OLS FOOD INCOME / COEF=BETA

OLS ESTIMATION
40 OBSERVATIONS     DEPENDENT VARIABLE = FOOD
...NOTE..SAMPLE RANGE SET TO:    1,   40

R-SQUARE =    .3171     R-SQUARE ADJUSTED =    .2991
VARIANCE OF THE ESTIMATE-SIGMA**2 =   46.853
STANDARD ERROR OF THE ESTIMATE-SIGMA =   6.8449
SUM OF SQUARED ERRORS-SSE=   1780.4
MEAN OF DEPENDENT VARIABLE =   23.595
LOG OF THE LIKELIHOOD FUNCTION = -132.672

VARIABLE   ESTIMATED  STANDARD   T-RATIO        PARTIAL STANDARDIZED ELASTICITY
NAME    COEFFICIENT   ERROR      38 DF   P-VALUE CORR. COEFFICIENT  AT MEANS
INCOME     .23225      .5529E-01   4.200      .000  .563      .5631      .6871
CONSTANT   7.3832      4.008       1.842      .073  .286      .0000      .3129

|_* Get the standard error
|_GEN1 SIG=SQRT(\$SIG2)
..NOTE..CURRENT VALUE OF \$SIG2=   46.853
|_* Set the number of replications for the Monte Carlo experiment.
|_GEN1 NREP=1000
|_SET RANFIX NODOECHO NOOUTPUT
|_DIM B40 NREP
|_* Use a DO-loop to do repeat operations
|_DO #=1,NREP
|_* Generate random normal numbers with standard deviation SIG.
|_  GENR E=NOR(SIG)
|_* Generate Y
|_  GENR Y = BETA:2 + BETA:1 * INCOME + E
|_* Run an OLS regression and save the estimated coefficients.
|_  OLS Y INCOME / COEF=BTEMP
|_  GEN1 B40:#=BTEMP:1
|_ENDO
|_DELETE FOOD INCOME E Y

|_* Now duplicate the observations and repeat the Monte Carlo experiment.
|_SAMPLE 41 80
|_READ (GHJ.txt) FOOD INCOME / CLOSE

|_SAMPLE 1 40
|_READ (GHJ.txt) FOOD INCOME / CLOSE

|_SAMPLE 1 80
|_DIM B80 NREP
|_DO #=1,NREP
|_  GENR E=NOR(SIG)
|_  GENR Y = BETA:2 + BETA:1 * INCOME + E
|_  OLS Y INCOME / COEF=BTEMP
|_  GEN1 B80:#=BTEMP:1
|_ENDO
|_SET OUTPUT

|_* Analyze the results - a histogram gives a frequency distribution.
|_SAMPLE 1 NREP
|_STAT B40 B80
NAME        N   MEAN        ST. DEV      VARIANCE     MINIMUM      MAXIMUM
B40       1000   .23098       .55417E-01   .30710E-02   .29141E-01   .40478
B80       1000   .23206       .39947E-01   .15957E-02   .12131       .39348

|_PLOT B40 / HISTO GROUPS=30

1000 OBSERVATIONS

GROUP COUNTS
GROUP       1       2       3       4       5       6       7       8
GROUP       9      10      11      12      13      14      15      16
GROUP      17      18      19      20      21      22      23      24
GROUP      25      26      27      28      29      30
COUNT      2.      2.      1.      5.     11.     16.     13.     28.
COUNT     30.     51.     59.     61.     72.     79.     69.     92.
COUNT     74.     65.     53.     42.     60.     31.     27.     16.
COUNT     18.     10.      4.      6.      0.      3.

HISTOGRAM - B40
PCT.    N
.097   97  I
.093   93  I
.089   89  I                              XX
.085   85  I                              XX
.081   81  I                              XX
.077   77  I                          XX  XX
.073   73  I                          XX  XXXX
.069   69  I                        XXXXXXXXXX
.065   65  I                        XXXXXXXXXXXX
.061   61  I                      XXXXXXXXXXXXXX
.057   57  I                    XXXXXXXXXXXXXXXX    XX
.053   53  I                    XXXXXXXXXXXXXXXXXX  XX
.049   49  I                  XXXXXXXXXXXXXXXXXXXX  XX
.045   45  I                  XXXXXXXXXXXXXXXXXXXX  XX
.041   41  I                  XXXXXXXXXXXXXXXXXXXXXXXX
.037   37  I                  XXXXXXXXXXXXXXXXXXXXXXXX
.033   33  I                  XXXXXXXXXXXXXXXXXXXXXXXX
.029   29  I                XXXXXXXXXXXXXXXXXXXXXXXXXXXX
.025   25  I              XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.021   21  I              XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.017   17  I              XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  XX
.013   13  I          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.009    9  I        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.005    5  I      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  XX
.001    1  IXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  XX
I---------I---------I---------I---------I---------I---------I
.647E-01  .120      .176      .231      .286      .342      .397

|_PLOT B80 / HISTO GROUPS=30

1000 OBSERVATIONS

GROUP COUNTS
GROUP       1       2       3       4       5       6       7       8
GROUP       9      10      11      12      13      14      15      16
GROUP      17      18      19      20      21      22      23      24
GROUP      25      26      27      28      29      30
COUNT      0.      1.      4.      3.      5.     13.     21.     32.
COUNT     35.     47.     55.     69.     74.     81.     85.     80.
COUNT     68.     56.     50.     48.     43.     36.     30.     26.
COUNT     16.      9.      4.      3.      3.      3.

HISTOGRAM - B80
PCT.    N
.097   97  I
.093   93  I
.089   89  I
.085   85  I                            XX
.081   81  I                          XXXX
.077   77  I                          XXXXXX
.073   73  I                        XXXXXXXX
.069   69  I                      XXXXXXXXXX
.065   65  I                      XXXXXXXXXXXX
.061   61  I                      XXXXXXXXXXXX
.057   57  I                      XXXXXXXXXXXX
.053   53  I                    XXXXXXXXXXXXXXXX
.049   49  I                    XXXXXXXXXXXXXXXXXX
.045   45  I                  XXXXXXXXXXXXXXXXXXXXXX
.041   41  I                  XXXXXXXXXXXXXXXXXXXXXXXX
.037   37  I                  XXXXXXXXXXXXXXXXXXXXXXXX
.033   33  I                XXXXXXXXXXXXXXXXXXXXXXXXXXXX
.029   29  I              XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.025   25  I              XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.021   21  I            XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.017   17  I            XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.013   13  I          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.009    9  I          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.005    5  I        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
.001    1  I  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
I---------I---------I---------I---------I---------I---------I
.112      .152      .192      .232      .272      .312      .352
|_STOP
