*****************************************************************************
* CHAPTER 17 - STATISTICS FOR BUSINESS & ECONOMICS, 5th Edition             *
*****************************************************************************
* Index Numbers
*
SAMPLE 1 12
READ WEEK PRICE
 1   20.250
 2   19.875
 3   19.000
 4   19.750
 5   20.250
 6   19.875
 7   19.375
 8   19.625
 9   21.125
10   22.375
11   25.000
12   23.000
*
* The GENR command is used to calculate the Price Index, PI, for the Ford
* Motor Company Stock with the first week as the base period.
*
GENR PI=(PRICE/20.250)*100
*
* Replicate Figure 17.1, p. 658
*
PRINT WEEK PRICE PI
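*
* A minimal sketch, not part of the textbook replication:  the same index
* with week 12 as the base period.  The names P12 and PI12 are illustrative
* only.  The base price is taken from the data with PRICE:12 rather than
* being typed in by hand, so PI12 equals 100 in week 12.
*
GEN1 P12=PRICE:12
GENR PI12=(PRICE/P12)*100
PRINT WEEK PRICE PI12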
*
DELETE / ALL
*----------------------------------------------------------------------------
*
SAMPLE 1 10
READ YEAR PWHEAT PCORN PSOYBEAN AVERAGE
 1  1.33  1.33  2.85  1.84
 2  1.34  1.08  3.03  1.82
 3  1.76  1.57  4.37  2.57
 4  3.95  2.55  5.68  4.06
 5  4.09  3.03  6.64  4.59
 6  3.56  2.54  4.92  3.67
 7  2.73  2.15  6.81  3.90
 8  2.33  2.02  6.42  3.59
 9  2.97  2.25  6.12  3.78
10  3.78  2.52  6.28  4.19
*
* The GENR command is used to calculate the Unweighted Price Index for this
* set of data on the Prices per Bushel of Three Crops in 10 Years.
*
GENR INDEX=(AVERAGE/1.84)*100
*
* Replicate Figure 17.2, p. 659
*
*  Note:  Numbers are not identical to textbook since SHAZAM uses 6
*         significant digits in the calculation.
*
PRINT PWHEAT PCORN PSOYBEAN AVERAGE INDEX
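*
* A minimal sketch, not part of the textbook replication:  the average price
* can also be computed from the three price series instead of being read in
* rounded to 2 decimal places.  The names AVG3, AVG1 and INDEX3 are
* illustrative only.  Small differences from INDEX are expected because
* AVERAGE was rounded before it was read in.
*
GENR AVG3=(PWHEAT+PCORN+PSOYBEAN)/3
GEN1 AVG1=AVG3:1
GENR INDEX3=(AVG3/AVG1)*100
PRINT AVERAGE AVG3 INDEX INDEX3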
*
*----------------------------------------------------------------------------
*
READ WHEAT CORN SOYBEAN TCOST / LIST
1352  4152  1127  10532
1618  5641  1176  13006
1545  5573  1271  13089
1705  5647  1547  14187
2122  5829  1547  14984
2142  6266  1288  14853
2026  6357  1716  16040
1799  7082  1843  17064
2134  7939  2268  19861
2370  6648  1817  17172
*
* The INDEX command computes price indexes from a set of price and
* quantity data on a number of commodities.  SHAZAM automatically calculates
* the Divisia, Paasche, Laspeyres and Fisher Price and Quantity Indexes
* when the INDEX command is specified.  The BASE= option specifies the
* observation number to be used as the base period for the index.
*
* The format of the command is:
*
*  INDEX p1 q1 p2 q2 p3 q3 ... / options
*
* where:  LASPEYRES= option stores the Laspeyres Price Index in the
*                    vector specified.
*
INDEX PWHEAT WHEAT PCORN CORN PSOYBEAN SOYBEAN / BASE=1 LASPEYRES=PLS
*
* The Laspeyres Price Index in Figure 17.3 is printed below.
*
GENR PINDEX=PLS*100
PRINT PINDEX
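*
* A minimal sketch, not part of the textbook replication:  the Laspeyres
* Price Index can be checked by hand as the cost of the Year 1 quantities at
* each year's prices, divided by their cost at Year 1 prices.  The names W1,
* C1, S1, PCOST, PCOST1 and LPCHECK are illustrative only.  LPCHECK should
* agree with PINDEX up to rounding.
*
GEN1 W1=WHEAT:1
GEN1 C1=CORN:1
GEN1 S1=SOYBEAN:1
GENR PCOST=PWHEAT*W1+PCORN*C1+PSOYBEAN*S1
GEN1 PCOST1=PCOST:1
GENR LPCHECK=(PCOST/PCOST1)*100
PRINT PINDEX LPCHECK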
*
* Figure 17.4 - Laspeyres Quantity Index for Wheat, Corn and Soybean on page
* 684 is replicated with the INDEX command.  In this case, the quantities
* are specified before the prices, so the LASPEYRES= option stores the
* Laspeyres Quantity Index in the vector specified (here QLS).
*
INDEX WHEAT PWHEAT CORN PCORN SOYBEAN PSOYBEAN / BASE=1 LASPEYRES=QLS
*
* The Laspeyres Quantity Index in Figure 17.4 is printed below.
*
GENR QINDEX=QLS*100
PRINT QINDEX
*
* Recall that the Laspeyres Price Index was previously calculated using
* Year 1 as the base.  Therefore, to print the Laspeyres Price Index for
* the first 5 years, the SAMPLE command is used before the PRINT command.
*
SAMPLE 1 5
PRINT PINDEX
*
* The DIM command is used to dimension a vector called INDX6 that is 10 rows
* in length.  The format of the DIM command is:
*
*    DIM var size var size ...
*
* where:  var  = name of the vector or matrix to be dimensioned
*         size = either 1 or 2 numbers separated by a space to
*                indicate the size of the var to be dimensioned
*
* The COPY command is then used to copy the data from Row 1 to 5 of PINDEX
* into Row 1 to 5 of the new vector, INDX6.  The format of the COPY command is:
*
*    COPY fromvar(s) tovar / options
*
* where:  fromvar(s) = list of vectors or a single matrix
*         tovar      = variable into which the fromvar(s) are to be
*                      copied
*         options    = list of desired options
*         FROW=      = specifies the rows of the fromvar(s) that are to
*                      be copied into the tovar
*         TROW=      = specifies the rows of the tovar into which the
*                      fromvar(s) are to be copied.
DIM INDX6 10
COPY PINDEX INDX6 / FROW=1;5 TROW=1;5
*
* Before the Aggregate Laspeyres Price Index for Wheat, Corn and Soybean is
* computed with base year 6 (BASE=6), the SAMPLE command must be specified
* to change the sample range to 1 10 so that all the price and quantity
* data are used in the calculation.
*
SAMPLE 1 10
INDEX PWHEAT WHEAT PCORN CORN PSOYBEAN SOYBEAN / BASE=6 LASPEYRES=PLS
GENR P2INDEX=PLS*100
*
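* The SAMPLE command is used to specify the sample range of 1 5 so that the
* base-1 index for the first 5 years can be rescaled to base year 6.  The
* value 198.5 is the base-1 Laspeyres Price Index in year 6.  The rescaled
* values are copied into Rows 1 to 5 of P2INDEX.  The sample range is then
* changed to 6 10 for the Laspeyres Price Index when the base year is 6.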
*
SAMPLE 1 5
GENR PINDEX=PINDEX*(100/198.5)
COPY PINDEX P2INDEX / FROW=1;5 TROW=1;5
SAMPLE 6 10
PRINT P2INDEX
COPY P2INDEX INDX6 / FROW=6;10 TROW=6;10
*
* To replicate Columns F and H of Figure 17.5, the SAMPLE command needs to be
* changed back to 1 10 from 6 10.
*
SAMPLE 1 10
PRINT YEAR INDX6 P2INDEX
*
* Figure 17.6, p. 663.
*
PLOT INDX6 YEAR
*
* Figure 17.7, p. 663.
*
PLOT PINDEX YEAR
*
*----------------------------------------------------------------------------
* A Nonparametric Test for Randomness, p. 665
*
SAMPLE 1 16
READ DAY VOLUME / LIST
 1   98
 2   93
 3   82
 4  103
 5  113
 6  111
 7  104
 8  103
 9  114
10  107
11  111
12  109
13  109
14  108
15  128
16   92
*
* The median observation of the volume data is calculated using the MEDIAN=
* option on the STAT command.  The median value is saved in a constant
* called M.
*
STAT VOLUME / MEDIAN=M
PRINT M
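*
* A minimal sketch, not part of the textbook replication:  an indicator that
* equals 1 when VOLUME is above the median and 0 otherwise, so the runs can
* be counted by eye.  This assumes that a logical relation in a GENR
* expression evaluates to 1 or 0.  The name ABOVE is illustrative only.
*
GENR ABOVE=(VOLUME.GT.M)
PRINT DAY VOLUME ABOVE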
*
* There are 2 ways of computing the Runs Test.  In the textbook, Newbold uses
* the residuals around the median.  In SHAZAM, the Runs Test is calculated
* with the residuals around the mean, which is the more common way of
* calculating the Runs Test.
*
* SHAZAM automatically computes the Runs Test when the OLS command is
* specified with the RSTAT option.  The LIST option is used to list and print
* out the residuals.  By default these are residuals about the mean, whereas
* the textbook uses residuals around the median.  Therefore, the GENR command
* is used to generate a constant vector of coefficients of 107.5 (the median
* printed above).  The INCOEF= option is used on the OLS command to specify
* the vector of coefficients to input.  The printed residuals are a visual
* check for the number of residuals that are above and below the median of
* 107.5.
*
GENR MEDV=107.5
OLS VOLUME / INCOEF=MEDV RSTAT LIST
*
* Figure 17.8 is replicated with the PLOT command.
*
PLOT VOLUME DAY
*
DELETE / ALL
*-----------------------------------------------------------------------------
* Example 17.1, p. 667
*
* The TIME command specifies the beginning year and frequency for a time
* series.  This is an alternate form of the SAMPLE command.
*
TIME 1931 1
SAMPLE 1931.0 1960.0
READ(PINKHAMSD.DIF) / DIF LIST
STAT SALES / MEDIAN=K
PRINT K
*
OLS SALES / RSTAT LIST
*
*----------------------------------------------------------------------------
* Components of a Time Series, p. 668
*
TIME 1946 4
SAMPLE 1946.1 2000.2
READ(MACRO2000.DIF) / DIF
STAT
*
* The original data set has missing values for some of the variables.  In
* order for SHAZAM to read this file, the missing values were replaced with
* -99999.  This is the default value for missing data in SHAZAM.  The
* SET SKIPMISS command is used to turn on automatic deletion of missing
* observations in the subsequent commands.  The SET NOWARNMISS command
* turns off the messages about missing observations.
*
SET SKIPMISS
SET NOWARNMISS
*
* Figure 17.11, p. 669
*
PLOT GDPH OBS
GRAPH GDPH OBS
*
*----------------------------------------------------------------------------
* Moving Averages, p. 671
*
* Recall that the Lydia Pinkham data was read into SHAZAM in Example 17.1.
* Therefore, it is not necessary to read in this data set again.  However,
* the TIME and SAMPLE commands must be specified again since this data set
* covers a different time period from the one used in the previous section.
*
TIME 1931.0 1
SAMPLE 1931.0 1960.0
*
* The TIME(0) function is used to create a time index so that the first
* observation is equal to 1 and the rest are consecutively numbered.
*
GENR T=TIME(0)
*
* The 5-Point Centered Moving Average for the SALES variable, SMOOTHED, is
* calculated using the GENR command and LAG function.  The LAG(x,n) function
* lags the variable x by n periods.  Using a negative value for n in the
* LAG(x,n) function leads the variable instead, giving future values.
*
GENR SMOOTHED=(LAG(SALES,2)+LAG(SALES)+SALES+LAG(SALES,-1)+LAG(SALES,-2))/5
GENR ACTUAL=SALES
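*
* A minimal sketch, not part of the textbook replication:  the moving average
* for the third observation computed by hand from the first 5 sales figures.
* The name MA3 is illustrative only.  MA3 should equal SMOOTHED:3.
*
GEN1 MA3=(SALES:1+SALES:2+SALES:3+SALES:4+SALES:5)/5
PRINT MA3 SMOOTHED:3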
*
* Print Table 17.4, p. 672
*
* Note:  In Table 17.4, the variable AVER1 is called SMOOTHED in SHAZAM.
*        The vector for SALES has been renamed to ACTUAL with the GENR
*        command so Figure 17.13 is easier to identify in SHAZAM.
*
PRINT YEAR SALES SMOOTHED
*
* Replicate Figure 17.13, p. 673
*
PLOT ACTUAL SMOOTHED T
GRAPH ACTUAL SMOOTHED T
*
DELETE / ALL
*-----------------------------------------------------------------------------
* Extraction of the Seasonal Component Through Moving Averages, p. 673
*
SAMPLE 1 32
READ YEAR X
1.1      0.300
1.2      0.460
1.3      0.345
1.4      0.910
2.1      0.330
2.2      0.545
2.3      0.440
2.4      1.040
3.1      0.495
3.2      0.680
3.3      0.545
3.4      1.285
4.1      0.550
4.2      0.870
4.3      0.660
4.4      1.580
5.1      0.590
5.2      0.990
5.3      0.830
5.4      1.730
6.1      0.610
6.2      1.050
6.3      0.920
6.4      2.040
7.1      0.700
7.2      1.230
7.3      1.060
7.4      2.320
8.1      0.820
8.2      1.410
8.3      1.250
8.4      2.730
*
* The 4-Point Moving Average, FPTMA, for the Earnings variable, X, is
* calculated using the GENR command and LAG function.  The LAG(x,n) function
* lags the variable x by n periods.  Using a negative value for n in the
* LAG(x,n) function leads the variable instead, giving future values.
*
GENR FPTMA=(LAG(X,2)+LAG(X,1)+X+LAG(X,-1))/4
*
* The Centered 4-Point Moving Average, C4PMA, for the Earnings variable, X, is
* calculated using the GENR command and LAG function.  As above, a negative
* value for n in the LAG(x,n) function leads the variable.  This average is
* calculated using 3 separate GENR statements (P1, P2 and C4PMA) to ensure
* there is no confusion.
*
GENR P1=(LAG(X,2)+LAG(X,1)+X+LAG(X,-1))/4
GENR P2=(LAG(X,1)+X+LAG(X,-1)+LAG(X,-2))/4
*
* The SAMPLE command is used to change the range of the data from 1 32 to
* 3 30 since P1 uses values lagged 2 time periods and P2 uses values led 2
* time periods.
*
SAMPLE 3 30
GENR C4PMA=(P1+P2)/2
*
* Table 17.5 is replicated with three PRINT commands since the data
* lengths for YEAR, EARNINGS, FPTMA and C4PMA are not identical.  The SAMPLE
* command is used before each PRINT command to ensure that only the desired
* data is printed.
*
SAMPLE 1 32
GENR EARNINGS=X
PRINT YEAR EARNINGS
SAMPLE 3 31
PRINT FPTMA
SAMPLE 3 30
PRINT C4PMA
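*
* A minimal sketch, not part of the textbook replication:  since C4PMA is the
* average of P1 and P2, it can also be written as a single weighted sum in
* which the two end values get weight 1/8 and the three middle values get
* weight 2/8.  The name C4CHK is illustrative only.  C4CHK should equal C4PMA
* for observations 3 to 30.
*
SAMPLE 1 32
GENR C4CHK=(LAG(X,2)+2*LAG(X,1)+2*X+2*LAG(X,-1)+LAG(X,-2))/8
SAMPLE 3 30
PRINT C4PMA C4CHK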
*
*-----------------------------------------------------------------------------
* The SAMPLE command is used to change the range to 3 30, the range over
* which C4PMA is defined, in calculating Column 4 of Table 17.6, p. 677
*
SAMPLE 3 30
GENR COL4=(X/C4PMA)*100
PRINT COL4
*
* The GENR command with the SUM and SEAS functions is used to create an index
* called CSINDEX that numbers each year (cross-section) 1, 2, ..., 8.  A
* repeating time index called TINDEX, which takes the values 1 to 4 for the
* 4 quarters within each year, is created with the GENR command.
*
SAMPLE 1 32
GENR CSINDEX=SUM(SEAS(4))
GENR TINDEX=TIME(0)-4*(CSINDEX-1)
PRINT CSINDEX TINDEX COL4
*
* The sample range is changed to include only observations 3 to 30 in
* calculating the median of each quarter with the STAT command.  The DO
* command creates a DO-loop to execute the 3 commands immediately following,
* once for each value of # from 1 to 4.  The first command skips all
* observations where the variable TINDEX is not equal to #.  The STAT
* command is then executed on the remaining observations and the descriptive
* statistics of the variable COL4 are printed.  The PMEDIAN option prints the
* median, mode and quartiles for variable COL4.  The MEDIAN= option stores
* the median value in a constant (M1 to M4).  Then the DELETE SKIP$ command
* permanently eliminates all the SKIPIF commands in effect.  The ENDO command
* indicates the end of the DO-loop.
*
SAMPLE 3 30
DO #=1,4
SKIPIF(TINDEX.NE.#)
STAT COL4 / PMEDIAN MEDIAN=M#
DELETE SKIP$
ENDO
*
* The GEN1 command is used to generate the constant for the sum of the
* median values.
*
GEN1 MEDSUM=M1+M2+M3+M4
*
* The sample range is reset to 1 32 to calculate the Seasonal Index.  The
* DO-loop is used to calculate the Seasonal Index of each quarter in
* Table 17.6.
*
SAMPLE 1 32
DO #=1,4
GEN1 SINDEX#=M#*400/MEDSUM
PRINT SINDEX#
ENDO
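*
* A minimal sketch, not part of the textbook replication:  because each
* Seasonal Index is M#*400/MEDSUM, the four indexes should sum to 400.  The
* name SSUM is illustrative only.
*
GEN1 SSUM=SINDEX1+SINDEX2+SINDEX3+SINDEX4
PRINT SSUM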
*
* The SET NOWARNSKIP command is used to suppress the printing of the warning
* message as to which observations will be skipped.  The Adjusted Series, AS,
* values are generated with the GENR command within a DO-loop.
*
DO #=1,4
SET NOWARNSKIP
SKIPIF(TINDEX.NE.#)
GENR AS=X*(100/SINDEX#)
DELETE SKIP$
ENDO
*
* The Adjusted Series data in Table 17.6 is printed with the PRINT command.
* Notice that this command is specified after the DO-loop has ended.  If the
* PRINT command were specified within the DO-loop, the values for AS would be
* printed each time the DO-loop was executed.  The numbers are not identical
* to the textbook since more significant digits were used by SHAZAM in the
* calculation.
*
PRINT AS
*
* The PLOT command is used to replicate Figure 17.15, p. 678
*
PLOT AS YEAR 
GRAPH AS YEAR
*
DELETE / ALL
*-----------------------------------------------------------------------------
* Exponential Smoothing, p. 682
*
* The TIME command specifies the beginning year and frequency for a time
* series.  This is an alternate form of the SAMPLE command.
*
TIME 1931 1
SAMPLE 1931.0 1960.0
READ(PINKHAMSD.DIF) / DIF 
STAT YEAR SALES 
*
* The GENR command and TIME(0) function is used to create a time index so
* that the first observation is equal to 1 and the rest are consecutively
* numbered.
*
GENR T=TIME(0)
*
* The GENR command is used to generate vectors of zeros for the variables X
* and XHAT.
*
GENR X=0
GENR XHAT=0
*
* The GEN1 command is used to generate the constant F to equal 1806.  The
* GENR command with the LAG(x,n) function is then used to generate L.  Using
* a negative value for n leads the variable, so L holds SALES one period
* ahead.
*
GEN1 F=1806
GENR L=LAG(SALES,-1)
*
* The DO command provides repeat operations.  The statements between the
* DO and ENDO commands are repeatedly executed.  In this example, the two
* GEN1 commands are executed 30 times.  The first pass of the DO-loop executes:
*
*         X:1=(0.40*F)+(0.60*L:1)
*
* once this value has been calculated the second GEN1 command is executed:
*
*         F=X:1
*
* so that the previously calculated value for X:1 becomes the new value of
* the constant F.  The ENDO in the following line indicates that all the
* commands have been executed and the DO-loop returns to the beginning.  Now
* the DO-loop executes:
*
*         X:2=(0.40*F)+(0.60*L:2)
*         F=X:2
*
* and continues to repeat this DO-loop until # has reached 30.
*
DO #=1,30
GEN1 X:#=(0.40*F)+(0.60*L:#)
GEN1 F=X:#
ENDO
*
* The PRINT command is used to print out the values for X.  
*
PRINT T X
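*
* A minimal sketch, not part of the textbook replication:  the first pass of
* the DO-loop computed by hand.  L:1 is SALES led one period, that is
* SALES:2, and the starting value of F is 1806.  The name CHK1 is
* illustrative only.  CHK1 should equal X:1.
*
GEN1 CHK1=(0.40*1806)+(0.60*SALES:2)
PRINT CHK1 X:1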
*
* The COPY command is used to copy vectors or matrices into other matrices.
* As well, it is possible to partition matrices, delete rows and columns and
* create matrices from vectors using the COPY command.  The FROW= option
* specifies the beginning and ending row the data is to be copied from.  The
* TROW= option specifies the beginning and ending row the data is to be
* copied to.  In this example, the first COPY command copies Row 1 of the
* vector SALES to Row 1 of the vector XHAT.  The second COPY command copies
* Row 1 to 29 of the vector X to Row 2 to 30 of the vector XHAT.
*
COPY SALES XHAT / FROW=1;1 TROW=1;1
COPY X XHAT / FROW=1;29 TROW=2;30
*
* The PRINT command is used to print out the values.
*
PRINT T SALES XHAT
*
* Plot Figure 17.17, p. 682
*
PLOT SALES T / YMIN=1100 YMAX=2700 NOPRETTY
GRAPH SALES T
*
*-----------------------------------------------------------------------------
* Example 17.3, p. 692
*
SAMPLE 1 30
*
* The GENR command is used with the LAG(x) function to generate the variables
* lagged SALES one time period (SALELAG1), lagged SALES two time periods
* (SALELAG2), lagged SALES three time periods (SALELAG3) and lagged SALES
* four time periods (SALELAG4).
*
GENR SALELAG1=LAG(SALES)
GENR SALELAG2=LAG(SALES,2)
GENR SALELAG3=LAG(SALES,3)
GENR SALELAG4=LAG(SALES,4)
*
* Figure 17.22, p. 693
*
* The sample range of the first-order model must be changed to 2 30 since
* the first observation is lost when the SALES variable is lagged one time
* period.  The first-order model is estimated with the OLS command.
*
* Regression with p=1
*
SAMPLE 2 30
OLS SALES SALELAG1
*
* The second-order model is estimated with the OLS command but the sample
* range is changed accordingly as the first two observations are lost in the
* lagging process of the SALES variable.  The COEF= option is used to save
* the regression estimates in the vector called COEF.  These values will
* be used in forecasting X31.
*
* Regression with p=2
*
SAMPLE 3 30
OLS SALES SALELAG1 SALELAG2 / COEF=COEF
*
* The third-order model is estimated and the sample range is changed
* accordingly as the first three observations are lost in the lagging process
* of the SALES variable.
*
* Regression with p=3
*
SAMPLE 4 30
OLS SALES SALELAG1 SALELAG2 SALELAG3
*
* The fourth-order model is estimated and the sample range is changed
* accordingly as the first four observations are lost in the lagging process
* of the SALES variable.
*
* Regression with p=4
*
SAMPLE 5 30
OLS SALES SALELAG1 SALELAG2 SALELAG3 SALELAG4
*
* The regression coefficients for the second-order model were saved in the
* vector COEF.  The CONSTANT is saved in COEF:3, the regression estimate for
* X lagged one time period is saved in COEF:1 and the regression estimate for
* X lagged two time periods is saved in COEF:2.  The sales figures for X29
* and X30 are stored in the vector SALES in SALES:29 and SALES:30.  The GEN1
* command is used to forecast the value of X when t=31.
*
GEN1 X31=COEF:3+(COEF:1*SALES:30)+(COEF:2*SALES:29)
PRINT X31 COEF:3 COEF:1 SALES:30
PRINT COEF:2 SALES:29
*
* Similarly, SALES can be forecast when t=32 using the GEN1 command.
*
GEN1 X32=COEF:3+(COEF:1*X31)+(COEF:2*SALES:30)
PRINT X32 
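*
* A minimal sketch, not part of the textbook replication:  the same recursion
* carried one step further to t=33, using the two previous forecasts in place
* of the lagged sales figures.  The name X33 is illustrative only.
*
GEN1 X33=COEF:3+(COEF:1*X32)+(COEF:2*X31)
PRINT X33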
*
DELETE / ALL
*-----------------------------------------------------------------------------
*
STOP