A list of the variables defined in the DDL file is written to the file ’MAKESDA.LST’ whenever MAKESDA is run. If that file already exists (from a previous MAKESDA run), it is overwritten. This list of variables can be very useful for creating a variable list file for the XCODEBK program. Note that MAKESDA cannot currently create SDA variables whose names are longer than 32 characters.
The data description file must be written in the Data Description Language (DDL). The file with DDL can be created with a text editor, or with the QEXTRACT or XCONVERT programs. MAKESDA can also read older DDL files in the format used by the CSA programs.
The DDL file must describe the characteristics of both the overall data file and the individual variables to be converted into SDA variables. The first variable description MUST be for a variable named ‘CASEID’.
If variables are added to an existing SDA dataset, MAKESDA checks the contents of the CASEID variable to make sure that the CASEID value for each case matches the value stored previously in the SDA dataset. It also checks the contents of CASEID if variables are being modified.
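The kind of consistency check described above can be sketched in Python. This is only an illustration, not MAKESDA's actual code: the function name `check_caseids`, the list-based inputs, and the decision to raise an error when the number of cases differs are all assumptions made for the example.

```python
def check_caseids(stored_ids, incoming_ids):
    """Compare CASEID values case by case.

    Returns a list of (case_number, stored_value, incoming_value) tuples
    for every case whose CASEID does not match. Case numbers start at 1.
    Both the function and its return format are hypothetical.
    """
    # Assumption: a different number of cases is treated as an error.
    if len(stored_ids) != len(incoming_ids):
        raise ValueError("number of cases differs from the existing dataset")

    mismatches = []
    for case_number, (stored, incoming) in enumerate(
        zip(stored_ids, incoming_ids), start=1
    ):
        if stored != incoming:
            mismatches.append((case_number, stored, incoming))
    return mismatches


existing = [1001, 1002, 1003]
matching = check_caseids(existing, [1001, 1002, 1003])
mismatched = check_caseids(existing, [1001, 1005, 1003])
print(matching)
print(mismatched)
```

An empty result means every incoming case lines up with the case already stored at the same position.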
Blanks in a character field
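When MAKESDA processes character variable values, blanks are automatically "normalized" before the values are stored as SDA variables. This means that leading and trailing blanks are stripped, and multiple internal blanks are reduced to a single blank.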
All-blank fields
There are various things you can do with an input field that is completely blank:
md_c = ""
blank_c = New Content
If the "New Content" you specify has more characters than are defined in the ’width=’ specification for this variable, the "New Content" will all be stored anyway in the SDA dataset.
By default, the case of a character input field is left as is, and it is stored in SDA as a case-sensitive character variable.
However, in the DDL file specifications for a character variable, you can specify that the input string be converted entirely to upper case or to lower case. This is done by specifying either ’case_c=upper’ or ’case_c=lower’ for a particular variable (or this can also be done globally for all character variables defined in that DDL file).
Unless the case of a character variable really matters, it is often a good idea to force the characters to be all the same case. For example, if you have a character variable for gender, and if the contents are ’M’ or ’m’ for male, and ’F’ or ’f’ for female, you would probably want to make all of the values either upper or lower case. Otherwise, when you use that variable in a table, you will get four rows or columns for gender instead of two.
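The effect of mixed-case codes on a table can be illustrated outside SDA. The Python sketch below is not MAKESDA code; the sample values and the helper name `tabulate` are made up for the illustration. It counts the distinct gender codes with and without forcing them to upper case.

```python
from collections import Counter


def tabulate(values, force_case=None):
    """Count distinct codes, optionally forcing 'upper' or 'lower' case first.

    Hypothetical helper; it only mimics the idea behind 'case_c='.
    """
    if force_case == "upper":
        values = [v.upper() for v in values]
    elif force_case == "lower":
        values = [v.lower() for v in values]
    return Counter(values)


# Made-up sample data with inconsistent case.
gender = ["M", "m", "F", "f", "F", "m"]

as_is = tabulate(gender)
forced = tabulate(gender, force_case="upper")

print(len(as_is), "categories when left as is:", dict(as_is))
print(len(forced), "categories after forcing upper case:", dict(forced))
```

Left as is, the sample yields four categories; forced to one case, it yields the two that were intended.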
If you use the ’case_c=’ specification, there are some ramifications:
md_c= REFUSED
md_c= Refused
md_c= refused
Note, however, that the case conversion applies only to the category code -- and NOT to the category label. In the catlabels example, the label "Don’t know" remains in mixed upper and lower case, regardless of what happens to the category code itself.
catlabels= DK Don’t know dk Don’t know Dk Don’t know
Character variables can be used as selection filters in the same way that numeric variables are used. Note, however, that the values of a selection filter variable are NOT case sensitive. Also, leading and trailing blanks are stripped from character codes specified as filter variables, and multiple internal blanks are reduced to a single blank. This is the same as happens to character values before they are stored as SDA variables, so the filter values should match the stored character values unless there is a substantive difference in the codes.
For example, the following filter specifications all have the same effect, regardless of whether the values of the character variable ’state’ have been forced to upper case or to lower case, or have been left as mixed case:
state("New York")
state(" NEW YORK ")
state("New   York")
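The matching rule for character filter values can be sketched in Python. This is an illustration of the behavior described above, not SDA's implementation; the function names `normalize` and `filter_matches` are hypothetical.

```python
def normalize(value):
    """Strip leading/trailing blanks and collapse internal runs to one blank.

    str.split() with no argument splits on any run of whitespace and
    discards empty pieces, so joining with a single space does both jobs.
    """
    return " ".join(value.split())


def filter_matches(filter_value, stored_value):
    """Compare a filter value with a stored value, ignoring case and extra blanks."""
    return normalize(filter_value).lower() == normalize(stored_value).lower()


# The stored value may have been forced to either case, or left mixed.
for stored in ("NEW YORK", "new york", "New York"):
    for spec in ("New York", " NEW YORK ", "New   York"):
        assert filter_matches(spec, stored)

print(repr(normalize("  New   York ")))
print(filter_matches("New York", "New Jersey"))
```

Only a substantive difference in the code, such as a different state name, prevents a match.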
DDL | Summary of the Data Description Language |
qextract | Convert CASES Q language files into DDL |
xconvert | Convert SAS/SPSS/Stata data definitions into DDL |