NlsyLinks (version 1.200)

ValidateOutcomeDataset: Validates the schema of datasets containing outcome variables.

Description

The NlsyLinks package handles much of the plumbing code needed to transform extracted NLSY datasets into a format that statistical routines can interpret. In some cases, a dataset of measured variables is needed, with one row per subject. This function validates the measured/outcome dataset to ensure it possesses an interpretable schema. For the specific list of requirements, see Details below.

Usage

ValidateOutcomeDataset(dsOutcome, outcomeNames)

Arguments

dsOutcome
A data frame with the measured variables
outcomeNames
The column names of the measured variables that will eventually be used by a statistical procedure.

Value

  • Returns TRUE if the validation passes. Throws an error (with an associated descriptive message) if it fails.

Details

The dsOutcome parameter must:
  1) Have a non-missing value.
  2) Contain at least one row.
  3) Contain a column called 'SubjectTag' (case sensitive).
  4) Have a SubjectTag column containing only positive numbers.
  5) Have a SubjectTag column in which all values are unique (i.e., two rows/subjects cannot have the same value).

The outcomeNames parameter must:
  1) Have a non-missing value.
  2) Contain only column names that are present in the dsOutcome data frame.
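
The requirements listed in Details can be sketched in base R to show what each check looks like in practice. This is only an illustration and not the NlsyLinks source: the function name SketchValidateOutcomeDataset, the toy data frame, and the error messages are hypothetical, and "non-missing" is assumed here to mean that the argument is supplied, is of the expected type, and contains no NA.

```r
# Hypothetical helper: a base-R sketch of the documented checks.
# It is NOT the NlsyLinks implementation; names and messages are illustrative.
SketchValidateOutcomeDataset <- function(dsOutcome, outcomeNames) {
  # dsOutcome 1) must be supplied (assumed here to mean: a data frame).
  if (missing(dsOutcome) || !is.data.frame(dsOutcome))
    stop("dsOutcome must be a data frame.")
  # dsOutcome 2) must contain at least one row.
  if (nrow(dsOutcome) < 1L)
    stop("dsOutcome must contain at least one row.")
  # dsOutcome 3) must contain a column named exactly 'SubjectTag'.
  if (!("SubjectTag" %in% colnames(dsOutcome)))
    stop("dsOutcome must contain a column called 'SubjectTag'.")
  # dsOutcome 4) SubjectTag must hold only positive numbers.
  tags <- dsOutcome$SubjectTag
  if (!is.numeric(tags) || anyNA(tags) || any(tags <= 0))
    stop("SubjectTag must contain only positive numbers.")
  # dsOutcome 5) SubjectTag values must be unique.
  if (anyDuplicated(tags) > 0L)
    stop("SubjectTag values must be unique.")
  # outcomeNames 1) must be supplied (assumed: a non-empty character vector without NA).
  if (missing(outcomeNames) || !is.character(outcomeNames) ||
      length(outcomeNames) == 0L || anyNA(outcomeNames))
    stop("outcomeNames must be a non-missing character vector.")
  # outcomeNames 2) every name must be a column of dsOutcome.
  absent <- setdiff(outcomeNames, colnames(dsOutcome))
  if (length(absent) > 0L)
    stop("Columns not found in dsOutcome: ", paste(absent, collapse = ", "))
  TRUE
}

# Toy data invented for this sketch.
dsToy <- data.frame(
  SubjectTag       = c(101, 102, 201),
  MathStandardized = c(0.5, -1.2, 0.3)
)
okResult  <- SketchValidateOutcomeDataset(dsToy, "MathStandardized")
badResult <- tryCatch(
  SketchValidateOutcomeDataset(dsToy, "MathMisspelled"),
  error = function(e) conditionMessage(e)
)
```

Here okResult holds the value returned for a valid schema, while badResult holds the text of the error raised for the misspelled column name.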

Examples

library(NlsyLinks) #Load the package into the current R session.
ds <- ExtraOutcomes79
outcomeNames <- c("MathStandardized", "WeightZGenderAge")
ValidateOutcomeDataset(dsOutcome=ds, outcomeNames=outcomeNames) #Returns TRUE.

outcomeNamesBad <- c("MathMisspelled", "WeightZGenderAge")
#ValidateOutcomeDataset(dsOutcome=ds, outcomeNames=outcomeNamesBad) #Throws error.
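
Because a failed validation throws an error rather than returning FALSE, a script that should keep running needs to catch it. The following is a minimal sketch using base R's tryCatch; the variable name validationResult is hypothetical, and the requireNamespace guard is only there so the snippet can be pasted into a session where NlsyLinks may not be installed.

```r
# validationResult is a hypothetical name used only in this sketch.
validationResult <- NA
if (requireNamespace("NlsyLinks", quietly = TRUE)) {
  library(NlsyLinks)
  outcomeNamesBad <- c("MathMisspelled", "WeightZGenderAge")
  validationResult <- tryCatch(
    ValidateOutcomeDataset(dsOutcome = ExtraOutcomes79, outcomeNames = outcomeNamesBad),
    error = function(e) {
      # Report the package's descriptive message, then signal failure with FALSE.
      message("Validation failed: ", conditionMessage(e))
      FALSE
    }
  )
}
```

With this pattern, downstream code can branch on validationResult instead of halting at the first invalid dataset.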
