Data file
A SHAZAM data file contains a set of numeric observations on
a group of variables.
Sources of data files
- Data files may be obtained by on-line retrieval from the internet.
Attention must be given to respecting licence agreements and
acknowledging data sources.
- Data may be collected from a survey. In this case, it may
be necessary to type the data into a data file.
A text editor can be used to
prepare the data file.
- Data files may be provided by other researchers.
Some academic journals maintain archives of data sets that
have been used in publications.
- Data files may be created by SHAZAM. Variables can
be constructed with SHAZAM commands, and the WRITE
command can be used to write the new data set to a data file.
Examples
- The Theil textile data set
- A household food expenditure data set
Rules for preparing SHAZAM data files
The standard format for a SHAZAM data file requires that the file
be prepared as a plain text file with numbers separated by spaces or commas.
Free format is allowed.
That is, there are no constraints on column position.
Note that a comma is treated as a separator. Therefore, the number
12,560 will be interpreted as two numbers: 12 and 560.
For correct interpretation by SHAZAM, commas in numeric data should be
removed. This can be done in an editor with a global edit change.
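The comma rule can be illustrated with a short Python sketch. This is not SHAZAM's own parser; it only mimics the behaviour described above (spaces and commas both act as separators). The helper names `split_free_format` and `strip_thousands_commas` are hypothetical, and the clean-up step assumes the file uses spaces, not commas, to separate values.

```python
import re

def split_free_format(line):
    """Split one line into numbers, treating spaces and commas as separators."""
    return [float(tok) for tok in re.split(r"[,\s]+", line.strip()) if tok]

def strip_thousands_commas(text):
    """Remove a comma that sits between a digit and a group of exactly three digits.

    Only safe when commas are NOT used as value separators in the file.
    """
    return re.sub(r"(?<=\d),(?=\d{3}(?!\d))", "", text)

raw = "12,560 3.5"

# Read as-is, the comma splits 12,560 into two numbers.
print(split_free_format(raw))                          # [12.0, 560.0, 3.5]

# After the global edit, the value is read as intended.
print(split_free_format(strip_thousands_commas(raw)))  # [12560.0, 3.5]
```

A global search-and-replace in a text editor achieves the same result; the sketch is only useful when many files need the same clean-up.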
In general, there must be no descriptive information and no
special characters of any kind embedded in the data file
(an exception to this is when the FORMAT command is
used - see below). Data documentation can be placed as a header to the file
or at the very end of the file (this is discussed in further detail
in the section on comment statements).
Spreadsheet data files can be used with one of the following methods:
- Convert the spreadsheet to a plain text (ASCII) file by using the
Save As ... option from the File menu.
- Save the spreadsheet in DIF format.
DIF files can be loaded with the SHAZAM READ command.
Instructions are in the SHAZAM User's Reference Manual.
- Microsoft Excel XLS files can be read by SHAZAM.
Instructions are available.
Two data preparation styles are permitted:
- NOBYVAR - Observation by observation.
Each observation (one value for every variable) begins on a new line.
The Theil textile data set
is prepared in this way.
- BYVAR - Variable by variable.
The observations for each variable begin on a new line.
Example: Consider a data set with observations on
height (inches) and weight (pounds) for 6 individuals (this example is from
Chotikapanich and Griffiths, 1993).
The data set prepared with the observation by observation method is:
69 112
56 84
62 102
67 135
70 165
63 120
The data set prepared with the BYVAR
(variable by variable) method is:
69 56 62 67 70 63
112 84 102 135 165 120
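The two layouts hold the same numbers; one is the transpose of the other. The following Python sketch shows this for the height and weight data. It is an illustration only, not SHAZAM's reader, and it assumes that each observation (or each variable) fits on a single line. The function names are hypothetical.

```python
def parse_rows(text):
    """Parse each non-blank line into a list of numbers (spaces or commas separate values)."""
    return [
        [float(tok) for tok in line.replace(",", " ").split()]
        for line in text.splitlines()
        if line.strip()
    ]

def columns_from_nobyvar(text):
    """Observation by observation: each line is one observation, so transpose the rows."""
    return [list(col) for col in zip(*parse_rows(text))]

def columns_from_byvar(text):
    """Variable by variable: each line already holds one variable."""
    return parse_rows(text)

nobyvar_text = """69 112
56 84
62 102
67 135
70 165
63 120
"""

byvar_text = """69 56 62 67 70 63
112 84 102 135 165 120
"""

height_a, weight_a = columns_from_nobyvar(nobyvar_text)
height_b, weight_b = columns_from_byvar(byvar_text)

# Both layouts yield the same two variables.
print(height_a == height_b and weight_a == weight_b)  # True
```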
Data can be spread across more than one data file. In that case,
multiple READ commands can be used to load the
complete data set into SHAZAM memory.
Special formats
Character data in data files can be read using the
FORMAT command. When this command is used, the data
cannot be in free format. More details are in the
SHAZAM User's Reference Manual.
Missing values
Missing values should be assigned a numeric
missing value code.
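As a rough sketch of this step, the Python snippet below replaces blank or non-numeric placeholders with a numeric code before the data file is written. The code value -99999, the placeholder markers, and the sample rows are assumptions made for illustration; check the SHAZAM documentation for the missing value code your setup expects.

```python
MISSING_CODE = -99999  # assumed code; confirm against the SHAZAM documentation

def to_data_line(fields, missing_code=MISSING_CODE, markers=("", "NA", ".")):
    """Join one observation into a data line, replacing missing markers with a numeric code."""
    cleaned = [
        str(missing_code) if field.strip() in markers else field.strip()
        for field in fields
    ]
    return " ".join(cleaned)

# Hypothetical survey rows (height, weight) with some missing entries.
survey = [["69", "112"], ["56", "NA"], ["62", ""]]

for row in survey:
    print(to_data_line(row))
# 69 112
# 56 -99999
# 62 -99999
```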
References
D. Chotikapanich and W.E. Griffiths, Learning SHAZAM: A Computer
Handbook for Econometrics, Wiley, 1993.