SHAZAM Data file

A SHAZAM data file contains a set of numeric observations on a group of variables.

Sources of data files

  • Data files may be downloaded from the internet. Respect licence agreements and acknowledge data sources.

  • Data may be collected from a survey. In this case, it may be necessary to type the data into a data file. A text editor can be used to prepare the data file.

  • Data files may be provided by other researchers. Some academic journals maintain archives of data sets that have been used in publications.

  • Data files may be created by SHAZAM. Variables can be constructed with SHAZAM commands and the WRITE command can be used to write the new data set to a data file.

Examples

  1. The Theil textile data set
  2. A household food expenditure data set

Rules for preparing SHAZAM data files

The standard format for a SHAZAM data file requires that the file be prepared as a plain text file with numbers separated by spaces or commas. Free format is allowed. That is, there are no constraints on column position.

Note that a comma is treated as a separator. Therefore, the number 12,560 will be interpreted as two numbers: 12 and 560. For correct interpretation by SHAZAM, commas in numeric data should be removed. This can be done in an editor with a global edit change.
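The effect of the comma rule can be illustrated outside SHAZAM with a short Python sketch. It is not SHAZAM code: the function names are hypothetical, and the splitting rule is only an approximation of free-format reading (split on any run of spaces or commas). The comma-stripping helper assumes the file uses spaces as separators and commas only inside numbers.

```python
import re

def split_free_format(line):
    """Split a line on runs of spaces and/or commas (approximates free format)."""
    return [tok for tok in re.split(r"[,\s]+", line.strip()) if tok]

def strip_thousands_commas(line):
    """Remove every comma. Safe only if commas are never used as separators."""
    return line.replace(",", "")

raw = "12,560  7.5"
print(split_free_format(raw))                          # ['12', '560', '7.5']
print(split_free_format(strip_thousands_commas(raw)))  # ['12560', '7.5']
```

If a file mixes both uses of the comma, a blanket removal like this would merge separate values, so the file should be checked before it is applied.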

In general, a data file must contain no descriptive information and no special characters of any kind. The exception is when the FORMAT command is used (see Special formats below). Data documentation can be placed as a header at the start of the file or at the very end of the file (this is discussed in further detail in the section on comment statements).

Spreadsheet data files can be used with one of the following methods:

  • Convert the spreadsheet to a plain text file (an ASCII file) by using the Save As ... option from the File menu.
  • Save the spreadsheet in DIF format. DIF files can be loaded with the SHAZAM READ command. Instructions are in the SHAZAM User's Reference Manual.
  • Microsoft Excel XLS files can be read by SHAZAM. Instructions are available.
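When a spreadsheet has been exported as comma-separated text, it still needs its header row and any commas removed before it meets the rules above. The following Python sketch shows one way to do that. It is an illustration only: the function name and the sample column names are hypothetical, and it assumes the first row of the export is a header.

```python
import csv
import io

def csv_to_plain_text(csv_text, skip_header=True):
    """Convert CSV text to space-separated numeric lines with no header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if skip_header and rows:
        rows = rows[1:]  # drop the descriptive header row
    lines = []
    for row in rows:
        if not row:
            continue  # ignore blank lines in the export
        # Remove thousands separators that survived inside quoted cells.
        values = [cell.strip().replace(",", "") for cell in row]
        lines.append("  ".join(values))
    return "\n".join(lines) + "\n"

exported = 'height,weight\n69,112\n56,"1,084"\n'
print(csv_to_plain_text(exported), end="")
# 69  112
# 56  1084
```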

Two data preparation styles are permitted:

  1. NOBYVAR - Observation by observation. Each observation begins on a new line, with the values of all variables listed on that line. The Theil textile data set is prepared in this way.

  2. BYVAR - Variable by variable. All observations for each new variable begin on a new line.

Example: Consider a data set with observations on height (inches) and weight (pounds) for 6 individuals (this example is from Chotikapanich and Griffiths, 1993). The data set prepared with the NOBYVAR (observation by observation) method is:

   69    112  
   56     84  
   62    102  
   67    135  
   70    165  
   63    120  

The data set prepared with the BYVAR (variable by variable) method is:

   69   56   62   67   70   63
  112   84  102  135  165  120
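The two layouts hold the same numbers; one is the transpose of the other. A minimal Python sketch of the conversion, using the height and weight data from the example, is given here. The function names are hypothetical, and the sketch assumes every line holds the same number of values.

```python
def parse_rows(text):
    """Split each non-blank line into its whitespace-separated values."""
    return [line.split() for line in text.splitlines() if line.strip()]

def transpose(rows):
    """Swap rows and columns (observation-by-observation <-> variable-by-variable)."""
    return [list(col) for col in zip(*rows)]

by_observation = """\
69 112
56  84
62 102
67 135
70 165
63 120
"""

by_variable = transpose(parse_rows(by_observation))
for row in by_variable:
    print("  ".join(row))
# 69  56  62  67  70  63
# 112  84  102  135  165  120
```

Applying transpose a second time returns the observation-by-observation layout.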

Data can be prepared in more than one data file. Then, multiple READ commands can be used to load the complete data set into SHAZAM memory.

Special formats

Character data in data files can be read using the FORMAT command. When this command is used the data cannot be in free format. More details are in the SHAZAM User's Reference Manual.

Missing values

Missing values should be assigned a numeric missing value code.
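Raw data often marks missing entries with blanks or text such as NA, which are not valid in a numeric data file. The Python sketch below replaces such markers with a numeric code before the file is written. The code -99999 and the list of markers are assumptions made for illustration, not values taken from this page; use the code that your SHAZAM setup expects.

```python
MISSING_CODE = "-99999"  # assumed numeric code; check the SHAZAM manual

def fill_missing(tokens, markers=("", "NA", ".")):
    """Replace non-numeric missing-value markers with a numeric code."""
    return [MISSING_CODE if tok.strip() in markers else tok.strip()
            for tok in tokens]

print(fill_missing(["69", "NA", "62", ""]))
# ['69', '-99999', '62', '-99999']
```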

References

D. Chotikapanich and W.E. Griffiths, Learning SHAZAM: A Computer Handbook for Econometrics, Wiley, 1993.

