ParseSASDatalines: Parse a SAS Dataline Command

Description

A parser for simple SAS dataline command texts. A data.frame is being built with the columnnames listed in the input section. The data object will be created in the given environment.

Usage

ParseSASDatalines(x, env = .GlobalEnv, overwrite = FALSE)

Arguments

the SAS text

env

environment in which the dataset should be created.

overwrite

logical. If set to TRUE, the function will silently overwrite a potentially existing object in env with the same name as declared in the SAS DATA section. If set to FALSE (default) an error will be raised if there already exists an object with the same name.

Value

Details

The SAS function DATA is designed for quickly creating a dataset from scratch. The whole step normally consists out of the DATA part defining the name of the dataset, an INPUT line declaring the variables and a DATALINES command followed by the values. The default delimiter used to separate the different variables is a space (thus each variable should be one word). The $ after the variable name indicates that the variable preceding contain character values and not numeric values. Without specific instructions, SAS assumes that variables are numeric. The function will fail, if it encounters a character in the place of an expected numeric value. Each new row in datalines will create a corresponding unique row in the dataset. Notice that a ; is not needed after every row, rather it is included at the end of the entire data step.

More complex command structures, i.e. other delimiters (dlm), in the INPUT-section are not (yet) supported.

Examples

Run this code

txt <- "
DATA asurvey;
INPUT id sex $ age inc r1 r2 r3 ;
DATALINES;
1   F  35 17  7 2 2
17  M  50 14  5 5 3
33  F  45  6  7 2 7
49  M  24 14  7 5 7
65  F  52  9  4 7 7
81  M  44 11  7 7 7
2   F  34 17  6 5 3
18  M  40 14  7 5 2
34  F  47  6  6 5 6
50  M  35 17  5 7 5
;
"

(d.frm <- ParseSASDatalines(txt))