mxGREMLDataHandler: Helper Function for Structuring GREML Data

Description

This function takes a dataframe or matrix and uses it to setup the 'y' and 'X' matrices for a GREML analysis; this includes trimming out NAs from 'X' and 'y.' The result is a matrix the first column of which is the 'y' vector, and the remaining columns of which constitute 'X.'

Usage

mxGREMLDataHandler(data, yvars=character(0), Xvars=list(), addOnes=TRUE, 
                  blockByPheno=TRUE, staggerZeroes=TRUE)

Arguments

data

Either a dataframe or matrix, with column names, containing the variables to be used as phenotypes and covariates in 'y' and 'X,' respectively.

yvars

Character vector. Each string names a column of the raw dataset, to be used as a phenotype.

Xvars

A list of data column names, specifying the covariates to be used with each phenotype. The list should have the same length as argument yvars.

addOnes

Logical; should lead columns of ones (for the regression intercepts) be adhered to the covariates when assembling the 'X' matrix? Defaults to TRUE.

blockByPheno

Logical; relevant to polyphenotype analyses. If TRUE (default), then the resulting 'y' will contain phenotype #1 for individuals 1 thru n, phenotype #2 for individuals 1 thru n, ... If FALSE, then observations

staggerZeroes

Logical; relevant to polyphenotype analyses. If TRUE (default), then each phenotype's covariates in 'X' are "staggered," and 'X' is padded out with zeroes. If FALSE, then 'X' is formed simply by stacking the phenotypes' covaria

Value

A list with these two components:
yXNumeric matrix. The first column is the phenotype vector, 'y,' while the remaining columns constitutethe 'X' matrix of covariates. If this matrix is used as the raw dataset for a model, then the model's GREML expectation can be constructed with dataset.is.yX=TRUE in mxExpectationGREML().
casesToDropNumeric vector. Contains the indices of the rows of the 'y' and 'X' that were dropped due to containing NA's. Can be provided as as argument casesToDropFromV to mxExpectationGREML().

Details

For a monophenotype analysis (only), argument Xdata can be a character vector. In a polyphenotype analysis, if the same covariates are to be used with all phenotypes, then Xdata can be a list of length 1.

Note the synergy between the output of mxGREMLDataHandler() and arguments dataset.is.yX and casesToDropFromV to mxExpectationGREML().

If the dataframe or matrix supplied for argument data has n rows, and argument yvars is of length p, then the resulting 'y' and 'X' matrices will have np rows. Then, if either matrix contains any NA's, the rows containing the NA's are trimmed from both 'X' and 'y' before being returned in the output (in which case they will obviously have fewer than np rows). Function mxGREMLDataHandler() reports which rows of the full-size 'X' and 'y' were trimmed out due to missing observations. These row indices can be provided as argument casesToDropFromV to mxExpectationGREML().

References

The OpenMx User's guide can be found at http://openmx.psyc.virginia.edu/documentation.

Examples

Run this code

dat <- cbind(rnorm(100),rep(1,100))
colnames(dat) <- c("y","x")
dat[42,1] <- NA
dat[57,2] <- NA
dat2 <- mxGREMLDataHandler(data=dat, yvars="y", Xvars=list("x"),
  addOnes = FALSE)
str(dat2)

Run the code above in your browser using DataLab