SDA 4.1 Documentation for DDLtoX


NAME

DDLtoX - Convert a DDL file to SAS, SPSS, or Stata definitions or to XML (for DDI)

USAGE

ddltox [-option] -i input_file

DESCRIPTION

(Currently the program is limited to converting DDL files that define fixed-format data files, not CSV or TSV files.)

DDLtoX converts the data descriptions in a DDL file into data definitions for SAS, SPSS, or Stata. The resulting definitions file, together with a matching data file, can then be used to create a system file for one of those systems and to run data analysis procedures. The program can also generate variable definitions in XML, following the conventions of the Data Documentation Initiative (DDI-Codebook version).

By default, DDLtoX creates SAS data definitions. To create definitions for SPSS, Stata, or XML, use the `-x' option.

DDLtoX obtains the data description information from the DDL file given after the `-i' flag on the command line. Since SAS, SPSS, and Stata do not define the attributes of variables in exactly the same way as DDL, there are occasionally some problems in converting from DDL to one of those systems. Some of the options of DDLtoX are designed to facilitate those conversions.

Warnings and error messages are placed in a file named `DDLTOX.MSG'. Users should view the contents of that file after running the program.
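
The run-then-check workflow described above can be scripted. The following is a minimal Python sketch, not part of the DDLtoX distribution. The helper names `build_ddltox_command', `read_messages', and `run_ddltox' are hypothetical, and the sketch assumes that `ddltox' is on the PATH and that `DDLTOX.MSG' is written to the current working directory.

```python
import subprocess
from pathlib import Path


def build_ddltox_command(ddl_file, output_file=None, system=None):
    """Assemble an argument list using only the documented -x, -i, -o flags."""
    cmd = ["ddltox"]
    if system is not None:
        cmd += ["-x", system]      # omitted: the program defaults to SAS
    cmd += ["-i", ddl_file]        # the input DDL file is required
    if output_file is not None:
        cmd += ["-o", output_file]
    return cmd


def read_messages(msg_path="DDLTOX.MSG"):
    """Return the non-empty lines of the message file, or [] if it is absent."""
    path = Path(msg_path)
    if not path.exists():
        return []
    text = path.read_text(errors="replace")
    return [line.rstrip() for line in text.splitlines() if line.strip()]


def run_ddltox(ddl_file, output_file, system=None):
    """Run the program, then return whatever it wrote to the message file.

    Assumption: the message file location is the current directory.
    """
    subprocess.run(build_ddltox_command(ddl_file, output_file, system), check=False)
    return read_messages()
```

A call such as run_ddltox("myddl", "mySPSS", system="spss") would then return the warnings as a list of lines for review.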


MEANING OF THE OPTIONS

Define Input and Output Files

-x system
Type of output to produce -- SAS, SPSS, Stata, or XML.
(Default is SAS. Any other system must be named either entirely in capital letters or entirely in lowercase, for example `SPSS' or `spss'.)

-i fname
Take input from the DDL file `fname'. (This specification is REQUIRED.)

-o fname
Write the new data definitions onto the file `fname' (instead of to the standard output).

Options for Variable Names

-@ character(s)
Replace any '@' character in a variable name with the specified conversion character(s). Any '@' characters in category labels will also be converted, so that GOTO information will reflect the revised variable names.
(This is useful for dealing with names from CASES instruments, especially when converting to SAS or Stata definitions.)

-m number
Override the default maximum for the length of each variable name (currently 32 characters) and set it to the number given after `-m'. The new maximum must be between 8 and 32.

If a variable name exceeds the maximum length, the corresponding variable definitions are skipped, and a warning is placed in the `DDLTOX.MSG' file.

-p prefix
Use the characters given after `-p' as a prefix to each variable name. (This is useful if variable names begin with a number, and they are to be converted to SAS, SPSS, or Stata.)
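
The combined effect of the `-@', `-p', and `-m' options on a single variable name can be illustrated with a short Python sketch. This is not the program's source code. The function name `convert_name' is hypothetical, and the order of operations (replace '@', add the prefix, then check the length) is an assumption. In particular, whether the prefix counts toward the maximum length is not stated in this document.

```python
def convert_name(name, at_replacement=None, prefix="", max_len=32):
    """Return the converted variable name, or None if it would be skipped.

    Assumed order: '@' replacement (-@), then prefix (-p), then the
    length check (-m). The real program may apply these differently.
    """
    if not 8 <= max_len <= 32:
        raise ValueError("maximum name length must be between 8 and 32")
    if at_replacement is not None:
        name = name.replace("@", at_replacement)
    name = prefix + name
    if len(name) > max_len:
        # DDLtoX skips such a variable and writes a warning to DDLTOX.MSG.
        return None
    return name
```

For example, convert_name("Q1@a", at_replacement="_") yields "Q1_a", and convert_name("1st", prefix="v") yields "v1st".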

Options for Category Labels

-s
Ignore short (bracketed) category labels. Applies only to SAS, SPSS, and Stata output.
(See the discussion of short/long category labels below.)

-n max_characters
Maximum number of characters to output as a short category label. Applies only to XML output.
(Default is 60.)
(See the discussion of short/long labels for XML output below.)

Other Options

-v fname
Limit the variables processed to those contained in the variable list file `fname'.

-h
Display short program help and available options. (The program will not do anything else.)


MISSING DATA CONVERSION

Each system has its own method of indicating which codes are to be considered invalid and therefore to be excluded from data analyses.

DDLtoX will attempt to convert as many missing-data specifications as it can for each variable. If there is something that cannot be converted, a warning is placed into the `DDLTOX.MSG' file.

SAS missing-data specifications

All numeric missing-data specifications in the DDL file are converted to IF-statements in SAS. Each such statement sets the referenced variable to the value `.', if the value in the data file matches the missing-data condition.

If character missing-data codes are used for numeric variables, a special missing-data statement is included which applies to all variables in the SAS definitions.

SPSS missing-data specifications

The first three discrete missing-data codes in the DDL for a variable are put into an SPSS `MISSING VALUES' statement. Any additional missing-data or valid range specifications for numeric variables are converted to IF-statements.

Character missing-data codes for numeric variables are not converted. If any are encountered, a warning is given.
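
The SPSS rule above (at most three discrete codes in `MISSING VALUES', the remainder handled by IF-statements) amounts to a simple partition of the code list. The Python sketch below only illustrates that partition. The function name `split_spss_missing' is hypothetical, and the sketch does not reproduce the statement text that DDLtoX actually writes.

```python
def split_spss_missing(discrete_codes, limit=3):
    """Split discrete missing-data codes into two groups.

    The first group (up to `limit` codes) corresponds to the codes placed
    in the MISSING VALUES statement; the second group corresponds to the
    codes that would need IF-statements instead.
    """
    codes = list(discrete_codes)
    return codes[:limit], codes[limit:]
```

With four codes such as 97, 98, 99, and 0, the first three fall into the first group and 0 falls into the second.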

Stata missing-data specifications

All numeric missing-data specifications in the DDL file are converted to REPLACE-IF statements in Stata. Each such statement sets the referenced variable to the value `.', if the value in the data file matches the missing-data condition.

Character missing-data codes for numeric variables are not converted. If any are encountered, a warning is given.

XML missing-data specifications

The DDI specification can handle all of the DDL missing-data conventions except for character missing-data codes for numeric variables. If any of those specifications are encountered, a warning is given.


SHORT/LONG CATEGORY LABELS for SAS, SPSS, or Stata (-s)

In a DDL file there can be a long label or text description for a response code, plus a short label given in square brackets. For example, a category label specification could look like:
Consistently votes Democrat [Democrat].
where the long label or text is `Consistently votes Democrat', and the short label is `Democrat'.

Short/long category labels for XML (-n)

In the DDI specification, there are two XML elements for the labels of categories: the 'labl' element and the 'txt' element. In general, the 'labl' element is intended to be used as a shorter label for statistical analysis programs, whereas the 'txt' element is intended to be used as a longer explanation of the meaning of a particular category.

In a DDL file the short version of a category label is put between square brackets after the longer category text. An example of such a label would be:

Definitely will vote in the next election [Definitely vote]
where the short label is `Definitely vote'.
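
The bracket convention shown above can be separated into its long and short parts with a small Python sketch. This only illustrates the label format. The function name `split_label' is hypothetical, and the sketch assumes that the bracketed short label, when present, is the last thing on the line.

```python
import re

# Long text, optional whitespace, then one bracketed short label at the end.
_LABEL_RE = re.compile(r"^(?P<long>.*?)\s*\[(?P<short>[^\[\]]*)\]\s*$")


def split_label(label):
    """Return (long_text, short_label); short_label is None without brackets."""
    match = _LABEL_RE.match(label)
    if match is None:
        return label.strip(), None
    return match.group("long"), match.group("short").strip()
```

Applied to the example above, split_label returns the long text `Definitely will vote in the next election' and the short label `Definitely vote'.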

EXAMPLES

ddltox -i myDDL -o mySAS -@_

Convert DDL to SAS definitions. Convert '@' in variable names to underscore symbols ('_').

ddltox -x spss -i myddl -o mySPSS -s

Convert DDL to SPSS definitions. Do not use bracketed short category labels found in the DDL file.

ddltox -x xml -n 40 -i myddl -o myXML

Convert DDL to XML definitions following the DDI specification; category labels longer than 40 characters will be put out using the 'txt' element instead of the 'labl' element.

SEE ALSO

DDI - Data Documentation Initiative (Codebook version)
DDL - Data Description Language
xconvert - Convert SAS, SPSS, or Stata definitions into DDI (XML) or DDL


CSM, UC Berkeley/ISA
June 4, 2019