Working with lagged variables

Regression equations that use time series data often contain lagged variables. For example, consider the regression equation:

$$Y_t = \beta_0 + \beta_1 Y_{t-1} + e_t$$

where $e_t$ is a random error term and the total number
of observations in the data set is T.
This equation contains a lagged dependent variable
as an explanatory variable. This is called an
autoregressive model or a dynamic model.
Note that the sample period is adjusted to start at observation 2.
This is because the first observation is "lost" when a lagged variable
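is required. So the estimation now uses T - 1 observations.

Another example of a model with lagged variables is:

$$Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + e_t$$

This model includes current and lagged values of the explanatory variables as regressors. This is called a distributed-lag model.

In SHAZAM lagged variables are created by using the LAG function on the GENR command, as in the command GENR L1SALES=LAG(SALES) used in the example below.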
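In general, for an n-period lag, the command format is:

GENR newvar=LAG(var,n)

where newvar is the name of the variable to be created, var is the variable to be lagged and n is the number of periods. Some important rules must be followed when the LAG function is used. In particular, as the note in the SHAZAM output below shows, lag values in undefined observations are set to zero, so the sample period must be adjusted with the SAMPLE command before estimation.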
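To make the effect of a lag concrete outside SHAZAM, here is a minimal Python sketch (it is not SHAZAM code). The helper name `lag` and its `fill` argument are hypothetical; the zero fill is an assumption chosen to mirror the note in the SHAZAM output that undefined observations are set to zero. The sample values are the first five SALES observations from the example data.

```python
def lag(series, n=1, fill=0.0):
    """Return `series` shifted forward by n periods.

    The first n positions have no lagged value, so they receive `fill`.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return list(series)
    return [fill] * min(n, len(series)) + list(series[:-n])


sales = [12.0, 20.5, 21.0, 15.5, 15.3]

l1sales = lag(sales)      # one-period lag
l2sales = lag(sales, 2)   # two-period lag

# The first observation of a one-period lag is only a placeholder, so the
# usable sample starts at observation 2 (index 1 with zero-based indexing).
start = 1
usable = list(zip(sales[start:], l1sales[start:]))

print(l1sales)
print(usable)
```

Dropping the placeholder rows before estimation plays the same role as adjusting the sample period: with T observations and a one-period lag, T - 1 pairs remain.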
Example - Regression with a Lagged Dependent Variable

This example uses a data set on monthly sales and advertising expenditures of a dietary weight control product. It is expected that the impact of advertising expenditures (variable name ADVERT) on sales (variable name SALES) will be distributed over a number of months. A model that captures the lagged advertising effects is:

$$\text{SALES}_t = \gamma + \alpha\,\text{SALES}_{t-1} + \beta\,\text{ADVERT}_t + e_t$$

The coefficients $\gamma$, $\alpha$ and $\beta$ can be estimated by the method of ordinary least squares. However, the presence of the lagged dependent variable means that the OLS estimation rule does not give a linear unbiased estimator. It follows that hypothesis testing will only be approximately valid. A result that can be established is that if the error process is serially uncorrelated then the lagged dependent variable will be uncorrelated with the current period error and the OLS estimator will be consistent (close to the true parameter value with high probability in large samples).

By repeated substitution for $\text{SALES}_{t-1}$, a 1 unit increase in advertising at time t raises sales by $\beta$ in the current period and by $\beta\alpha^k$ in period t + k. With $|\alpha| < 1$ this gives a pattern of exponentially declining impacts as time goes on. The total increase in sales over all current and future time periods is the sum:

$$\beta\,(1 + \alpha + \alpha^2 + \alpha^3 + \dots) = \frac{\beta}{1 - \alpha}$$

This is the result for the sum of an infinite geometric series when $|\alpha| < 1$. After only k time periods the effect is:

$$\beta\,(1 + \alpha + \dots + \alpha^k) = \frac{\beta\,(1 - \alpha^{k+1})}{1 - \alpha}$$

Thus at time k, the percentage of the total advertising effect realized is:

$$100\,(1 - \alpha^{k+1})$$

The above can be solved to find the time period k at which 100p percent of the impact on sales is expected. This gives:

$$k = \frac{\log(1 - p)}{\log(\alpha)} - 1$$

The SHAZAM commands (filename:
The SHAZAM output is listed at the end of this section. The estimated equation is:

$$\text{SALES}_t = 7.45 + 0.528\,\text{SALES}_{t-1} + 0.146\,\text{ADVERT}_t$$

The results show that a 1 unit increase in advertising gives a 0.146 unit increase in sales in the current month. However, the total expected increase in sales in the current and all future months is calculated as:

0.146 / (1 - 0.528) = 0.310 (the value SHAZAM reports, computed from the unrounded coefficients)

The time period at which 95% of the effect is realized is found as:

log(1 - 0.95)/log(0.528) - 1 = 3.69

This implies that after 4 months more than 95% of the advertising effect will be reflected in the sales performance. The figure below shows the month by month sales response to advertising in the current month.
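The arithmetic above can be checked with a short Python sketch (not SHAZAM code). Here `alpha` stands for the coefficient on lagged sales and `beta` for the coefficient on advertising; the numeric values are the unrounded estimates copied from the SHAZAM output, and the function names are hypothetical. The log-based formula assumes 0 < alpha < 1.

```python
import math

# Unrounded OLS estimates copied from the SHAZAM output for this example.
alpha = 0.52793  # coefficient on lagged sales (L1SALES)
beta = 0.14647   # coefficient on advertising (ADVERT)


def total_effect(alpha, beta):
    """Sum of beta * alpha**k over k = 0, 1, 2, ... (needs |alpha| < 1)."""
    if abs(alpha) >= 1:
        raise ValueError("the geometric series only converges for |alpha| < 1")
    return beta / (1.0 - alpha)


def share_realized(alpha, k):
    """Fraction of the total effect realized by the end of period k."""
    return 1.0 - alpha ** (k + 1)


def periods_for_share(alpha, p):
    """Period k at which a fraction p of the total effect is realized."""
    return math.log(1.0 - p) / math.log(alpha) - 1.0


total = total_effect(alpha, beta)
p95 = periods_for_share(alpha, 0.95)

# Month-by-month response to a 1 unit increase in advertising in month 0.
impacts = [beta * alpha ** k for k in range(7)]

print(f"total effect: {total:.4f}")
print(f"95% period:   {p95:.2f}")
for k, impact in enumerate(impacts):
    print(f"month {k}: impact {impact:.5f}, "
          f"share realized {share_realized(alpha, k):.3f}")
```

The `impacts` list corresponds to the month by month response that the SHAZAM commands store in the variable IMPACT, so it can stand in for a table of the sales response when no plot is at hand.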
SHAZAM output - Regression with a Lagged Dependent Variable

|_SAMPLE 1 36
|_READ (SALES.txt) SALES ADVERT
UNIT 88 IS NOW ASSIGNED TO: SALES.txt
   2 VARIABLES AND       36 OBSERVATIONS STARTING AT OBS       1
|_GENR L1SALES=LAG(SALES)
..NOTE.LAG VALUE IN UNDEFINED OBSERVATIONS SET TO ZERO
|_* List the data and take a look
|_PRINT SALES L1SALES ADVERT
      SALES        L1SALES       ADVERT
   12.00000       .0000000     15.00000
   20.50000       12.00000     16.00000
   21.00000       20.50000     18.00000
   15.50000       21.00000     27.00000
   15.30000       15.50000     21.00000
   23.50000       15.30000     49.00000
   24.50000       23.50000     21.00000
   21.30000       24.50000     22.00000
   23.50000       21.30000     28.00000
   28.00000       23.50000     36.00000
   24.00000       28.00000     40.00000
   15.50000       24.00000     3.000000
   17.30000       15.50000     21.00000
   25.30000       17.30000     29.00000
   25.00000       25.30000     62.00000
   36.50000       25.00000     65.00000
   36.50000       36.50000     46.00000
   29.60000       36.50000     44.00000
   30.50000       29.60000     33.00000
   28.00000       30.50000     62.00000
   26.00000       28.00000     22.00000
   21.50000       26.00000     12.00000
   19.70000       21.50000     24.00000
   19.00000       19.70000     3.000000
   16.00000       19.00000     5.000000
   20.70000       16.00000     14.00000
   26.50000       20.70000     36.00000
   30.60000       26.50000     40.00000
   32.30000       30.60000     49.00000
   29.50000       32.30000     7.000000
   28.30000       29.50000     52.00000
   31.30000       28.30000     65.00000
   32.30000       31.30000     17.00000
   26.40000       32.30000     5.000000
   23.40000       26.40000     17.00000
   16.40000       23.40000     1.000000
|_* Adjust the sample period
|_SAMPLE 2 36
|_OLS SALES L1SALES ADVERT / COEF=BETA

 OLS ESTIMATION
       35 OBSERVATIONS     DEPENDENT VARIABLE = SALES
...NOTE..SAMPLE RANGE SET TO:    2,   36

 R-SQUARE =    .6720     R-SQUARE ADJUSTED =    .6515
VARIANCE OF THE ESTIMATE-SIGMA**2 =   12.142
STANDARD ERROR OF THE ESTIMATE-SIGMA =   3.4845
SUM OF SQUARED ERRORS-SSE=   388.53
MEAN OF DEPENDENT VARIABLE =   24.606
LOG OF THE LIKELIHOOD FUNCTION = -91.7859

VARIABLE   ESTIMATED   STANDARD    T-RATIO            PARTIAL  STANDARDIZED  ELASTICITY
  NAME    COEFFICIENT    ERROR       32 DF   P-VALUE   CORR.   COEFFICIENT    AT MEANS
L1SALES     .52793      .1021        5.170     .000    .675      .5478         .5252
ADVERT      .14647      .3308E-01    4.428     .000    .616      .4692         .1721
CONSTANT    7.4469      2.470        3.015     .005    .470      .0000         .3027
|_* Analyze the effect of a 1 unit increase in advertising.
|_GEN1 A=BETA:1
|_GEN1 B=BETA:2
|_* Get the total impact of advertising on all future sales.
|_GEN1 TOTAL = B/(1-A)
|_* Find the time period at which 95% of the impact is expected.
|_GEN1 P95 = LOG(1-.95)/LOG(A) - 1
|_PRINT TOTAL P95
    TOTAL
   .3102750
    P95
   3.689629
|_* Find the expected increases in sales for up to 6 months ahead.
|_SAMPLE 1 7
|_GENR AHEAD=TIME(-1)
|_GENR IMPACT = B*(A**AHEAD)
|_PRINT AHEAD IMPACT
      AHEAD         IMPACT
   .0000000       .1464728
   1.000000       .7732677E-01
   2.000000       .4082280E-01
   3.000000       .2155141E-01
   4.000000       .1137755E-01
   5.000000       .6006502E-02
   6.000000       .3170988E-02
|_STOP
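As a cross-check on the listing above, the regression can be re-estimated outside SHAZAM. The following Python sketch (not SHAZAM code) uses only the standard library and solves the normal equations directly; the data are transcribed from the PRINT listing, and the helper names `solve` and `ols` are hypothetical. Under these assumptions the estimates should agree with the SHAZAM coefficients up to rounding.

```python
# SALES and ADVERT for observations 1..36, transcribed from the PRINT listing.
sales = [12.0, 20.5, 21.0, 15.5, 15.3, 23.5, 24.5, 21.3, 23.5, 28.0, 24.0, 15.5,
         17.3, 25.3, 25.0, 36.5, 36.5, 29.6, 30.5, 28.0, 26.0, 21.5, 19.7, 19.0,
         16.0, 20.7, 26.5, 30.6, 32.3, 29.5, 28.3, 31.3, 32.3, 26.4, 23.4, 16.4]
advert = [15.0, 16.0, 18.0, 27.0, 21.0, 49.0, 21.0, 22.0, 28.0, 36.0, 40.0, 3.0,
          21.0, 29.0, 62.0, 65.0, 46.0, 44.0, 33.0, 62.0, 22.0, 12.0, 24.0, 3.0,
          5.0, 14.0, 36.0, 40.0, 49.0, 7.0, 52.0, 65.0, 17.0, 5.0, 17.0, 1.0]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(y, columns):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Use observations 2..36 so that every row has a defined lagged value.
y = sales[1:]
l1sales = sales[:-1]          # SALES lagged one period
adv = advert[1:]
const = [1.0] * len(y)

a_hat, b_hat, c_hat = ols(y, [l1sales, adv, const])
print(f"L1SALES {a_hat:.5f}  ADVERT {b_hat:.5f}  CONSTANT {c_hat:.4f}")
```

Solving the normal equations is adequate for a small, well-conditioned problem like this one; a QR-based routine from a numerical library would be the more robust choice for larger models.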