umxACE: umxACE

Description

Make a 2-group ACE Cholesky Twin model. umxACE supports a core model in behavior genetics, known as the ACE Cholesky model (Cardon and Neale, 1996). #' This model decomposes phenotypic variance into Additive genetic, unique environmental (E) and, optionally, either common or shared-environment (C) or non-additive genetic effects (D). Scroll down to details for how to use the function, a figure and multiple examples. A common task in twin modelling involves using the genetic and environmental differences between large numbers of pairs of mono-zygotic (MZ) and di-zygotic (DZ) twins reared together to model the genetic and environmental structure of one, or, typically, several phenotypes (measured behaviors).

Usage

umxACE(name = "ACE", selDVs, selCovs = NULL, dzData, mzData, suffix = NULL, dzAr = 0.5, dzCr = 1, addStd = TRUE, addCI = TRUE, numObsDZ = NULL, numObsMZ = NULL, boundDiag = NULL, weightVar = NULL, equateMeans = TRUE, bVector = FALSE, thresholds = c("deviationBased", "left_censored"), autoRun = getOption("umx_auto_run"))

Arguments

name

The name of the model (defaults to"ACE")

selDVs

The variables to include from the data

selCovs

(optional) covariates to include from the data (do not include suffix in names)

dzData

The DZ dataframe

mzData

The MZ dataframe

suffix

The suffix for twin 1 and twin 2, often "_T". If set, simplifies SelDVs, i.e., just "dep" not c("dep_T1", "dep_T2")

dzAr

The DZ genetic correlation (defaults to .5, vary to examine assortative mating)

dzCr

The DZ "C" correlation (defaults to 1: set to .25 to make an ADE model)

addStd

Whether to add the algebras to compute a std model (defaults to TRUE)

addCI

Whether to add intervals to compute CIs (defaults to TRUE)

numObsDZ

= Number of DZ twins: Set this if you input covariance data

numObsMZ

= Number of MZ twins: Set this if you input covariance data

boundDiag

= (optional) lbound for diagonal of the a, c, and e matrices.

weightVar

= If provided, a vector objective will be used to weight the data. (default = NULL)

equateMeans

Whether to equate the means across twins (defaults to TRUE)

bVector

Whether to compute row-wise likelihoods (defaults to FALSE)

thresholds

How to implement ordinal thresholds c("deviationBased", "left_censored")

autoRun

Whether to mxRun the model (default TRUE: the estimated model will be returned)

Value

- mxModel of subclass mxModel.ACE

Details

This model decomposes phenotypic variance into Additive genetic, unique environmental (E) and, optionally, either common or shared-environment (C) or non-additive genetic effects (D). This latter restriction emerges due to a lack of degrees of freedom to simultaneously model C and D with only MZ and DZ twin pairs ref?. The Cholesky or lower-triangle decomposition allows a model which is both sure to be solvable, and also to account for all the variance (with some restrictions) in the data. This model creates as many latent A C and E variables as there are phenotypes, and, moving from left to right, decomposes the variance in each component into successively restricted factors. The following figure shows how the ACE model appears as a path diagram:

ACE.png

Data Input The function flexibly accepts raw data, and also summary covariance data (in which case the user must also supple numbers of observations for the two input data sets).

Ordinal Data In an important capability, the model transparently handles ordinal (binary or multi-level ordered factor data) inputs, and can handle mixtures of continuous, binary, and ordinal data in any combination. An experimental feature is under development to allow Tobit modelling.

The function also supports weighting of individual data rows. In this case, the model is estimated for each row individually, then each row likelihood is multiplied by its weight, and these weighted likelyhoods summed to form the model-likelihood, which is to be minimised. This feature is used in the non-linear GxE model functions.

Additional features The umxACE function supports varying the DZ genetic association (defaulting to .5) to allow exploring assortative mating effects, as well as varying the DZ “C” factor from 1 (the default for modelling family-level effects shared 100 to .25 to model dominance effects.

References

- http://www.github.com/tbates/umx

Examples

Run this code


# ===========================
# = Univariate model of BMI =
# ===========================
require(umx)
data(twinData) # ?twinData set from Australian twins.
# Pick the variables
selDVs = c("bmi1", "bmi2")
mzData <- twinData[twinData$zyg == 1,][1:80,] # 80 pairs for speed
dzData <- twinData[twinData$zyg == 3,][1:80,]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData)
umxSummary(m1, std = FALSE)
plot(m1)

# =========================================
# = ADE model (DZ correlation set to .25) =
# =========================================
m2 = umxACE("ADE", selDVs = selDVs, dzData = dzData, mzData = mzData, dzCr = .25)
umxCompare(m2, m1) # ADE is better
umxSummary(m2, compare = m1) # nb: though this is ADE, columns are labeled ACE


# =====================================
# = Bivariate height and weight model =
# =====================================
data(twinData)
selDVs = c("ht", "wt") # umx will add suffix (in this case "") + "1" or '2'
mzData <- twinData[twinData$zygosity %in% c("MZFF", "MZMM"),]
dzData <- twinData[twinData$zygosity %in% c("DZFF", "DZMM", "DZOS"), ]
mzData <- mzData[1:80,] # quicker run to keep CRAN happy
dzData <- dzData[1:80,]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
umxSummary(m1)

# ================================================================
# = Well done! Now you can make and compare twin models in R :-) =
# ================================================================


# ===================
# = Ordinal example =
# ===================
require(umx)
data(twinData)
# Cut bmi colum to form ordinal obesity variables
ordDVs = c("obese1", "obese2")
selDVs = c("obese")
obesityLevels = c('normal', 'overweight', 'obese')
cutPoints <- quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 <- cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
# Make the ordinal variables into mxFactors (ensure ordered is TRUE, and require levels)
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)
mzData <- twinData[twinData$zyg == 1, umx_paste_names(selDVs, "", 1:2)]
dzData <- twinData[twinData$zyg == 3, umx_paste_names(selDVs, "", 1:2)]
mzData <- mzData[1:80,] # just top 80 pairs to run fast
dzData <- dzData[1:80,]
str(mzData) # make sure mz, dz, and t1 and t2 have the same levels!
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
umxSummary(m1)

# ============================================
# = Bivariate continuous and ordinal example =
# ============================================
data(twinData)
selDVs = c("wt", "obese")
# Cut bmi column to form ordinal obesity variables
ordDVs = c("obese1", "obese2")
obesityLevels = c('normal', 'overweight', 'obese')
cutPoints <- quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 <- cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
# Make the ordinal variables into mxFactors (ensure ordered is TRUE, and require levels)
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)
mzData <- twinData[twinData$zyg == 1,] # umxACE can trim out unused variables on its own
dzData <- twinData[twinData$zyg == 3,]
mzData <- mzData[1:80,] # just top 80 so example runs in a couple of secs
dzData <- dzData[1:80,]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
plot(m1)

# =======================================
# = Mixed continuous and binary example =
# =======================================
require(umx)
data(twinData)
# Cut to form category of 20% obese subjects
cutPoints <- quantile(twinData[, "bmi1"], probs = .2, na.rm = TRUE)
obesityLevels = c('normal', 'obese')
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 <- cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
# Make the ordinal variables into mxFactors (ensure ordered is TRUE, and require levels)
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)
selDVs = c("wt", "obese")
mzData <- twinData[twinData$zyg == 1, umx_paste_names(selDVs, "", 1:2)]
dzData <- twinData[twinData$zyg == 3, umx_paste_names(selDVs, "", 1:2)]
mzData <- mzData[1:80,] # just top 80 so example runs in a couple of secs
dzData <- dzData[1:80,]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
umxSummary(m1)

# ===================================
# Example with covariance data only =
# ===================================

require(umx)
data(twinData)
selDVs = c("wt1", "wt2")
mz = cov(twinData[twinData$zyg == 1, selDVs], use = "complete")
dz = cov(twinData[twinData$zyg == 3, selDVs], use = "complete")
m1 = umxACE(selDVs=selDVs, dzData=dz, mzData=mz, numObsDZ=nrow(dzData), numObsMZ=nrow(mzData))
umxSummary(m1)
plot(m1)
# =========================================================
# Example with Tobin data (this is not profuction quality =
# Definately a work in progress!!!                        =
# =========================================================

# Analyse a dataset in which only people with a BMI over 22 had their BMI recorded

require(umx)
data(twinData)
baseNames = c("bmi")
selDVs = umx_paste_names(baseNames, "", 1:2)
tmp = twinData[, c(selDVs, "zyg")]
tmp$bmi1[tmp$bmi1 <= 22] = 22
tmp$bmi2[tmp$bmi2 <= 22] = 22
tmp = umxFactor(tmp) # BAD! letting it know that there are wide data!
tmp = umxFactor(tmp, sep = "") # GOOD same factor for each twin!
mz = tmp[tmp$zyg == 1, selDVs]
dz = tmp[tmp$zyg == 3, selDVs]
x  = umxThresholdMatrix(dz, sep = "", thresholds = "left_censored", verbose = TRUE)
m1 = umxACE(selDVs = baseNames, dzData = dz, mzData = mz, suffix = "", thresholds = "left_censored")
umxSummary(m1)
plot(m1)

Run the code above in your browser using DataLab