umxACE: Build and run a 2-group Cholesky twin model (uni-variate or multi-variate)

Description

Implementing a core task in twin modeling, umxACE models the genetic and environmental structure of one or more phenotypes (measured variables) using the Cholesky ACE model (Cardon and Neale, 1996).

Classical twin modeling uses the genetic and environmental differences among pairs of mono-zygotic (MZ) and di-zygotic (DZ) twins reared together.

The Cholesky decomposes this phenotypic variance into Additive genetic, unique environmental (E) and, optionally, either common or shared-environment (C) or non-additive genetic effects (D). Scroll down to details for how to use the function, a figure and multiple examples.

The Cholesky or lower-triangle decomposition allows a model which is both sure to be solvable, and also to account for all the variance (with some restrictions) in the data. This model creates as many latent A C and E variables as there are phenotypes, and, moving from left to right, decomposes the variance in each component into successively restricted factors. The following figure shows how the ACE model appears as a path diagram:

Usage

umxACE(name = "ACE", selDVs, selCovs = NULL, covMethod = c("fixed",
  "random"), dzData, mzData, suffix = NULL, dzAr = 0.5, dzCr = 1,
  addStd = TRUE, addCI = TRUE, numObsDZ = NULL, numObsMZ = NULL,
  boundDiag = 0, weightVar = NULL, equateMeans = TRUE, bVector = FALSE,
  thresholds = c("deviationBased", "WLS"),
  autoRun = getOption("umx_auto_run"), sep = NULL, optimizer = NULL,
  intervals = FALSE)

Arguments

name

The name of the model (defaults to"ACE").

selDVs

The variables to include from the data: preferably, just "dep" not c("dep_T1", "dep_T2").

selCovs

(optional) covariates to include from the data (do not include suffix in names)

covMethod

How to treat covariates: "fixed" (default) or "random".

dzData

The DZ dataframe.

mzData

The MZ dataframe.

suffix

Allowed as a synonym for "suffix".

dzAr

The DZ genetic correlation (defaults to .5, vary to examine assortative mating).

dzCr

The DZ "C" correlation (defaults to 1: set to .25 to make an ADE model).

addStd

Whether to add the algebras to compute a std model (defaults to TRUE).

addCI

Whether to add intervals to compute CIs (defaults to TRUE).

numObsDZ

= Number of DZ twins: Set this if you input covariance data.

numObsMZ

= Number of MZ twins: Set this if you input covariance data.

boundDiag

= Numeric lbound for diagonal of the a, c, and e matrices. Defaults to 0 since umx version 1.8

weightVar

= If provided, a vector objective will be used to weight the data. (default = NULL).

equateMeans

Whether to equate the means across twins (defaults to TRUE).

bVector

Whether to compute row-wise likelihoods (defaults to FALSE).

thresholds

How to implement ordinal thresholds c("deviationBased", "WLS").

autoRun

Whether to mxRun the model (default TRUE: the estimated model will be returned).

sep

The separator in twin variable names, often "_T", e.g. "dep_T1". Simplifies selDVs.

optimizer

Optionally set the optimizer (default NULL does nothing).

intervals

Whether to run mxCI confidence intervals (default = FALSE)

Value

- mxModel of subclass mxModel.ACE

Details

Data Input The function flexibly accepts raw data, and also summary covariance data (in which case the user must also supple numbers of observations for the two input data sets).

Ordinal Data In an important capability, the model transparently handles ordinal (binary or multi-level ordered factor data) inputs, and can handle mixtures of continuous, binary, and ordinal data in any combination. An experimental feature is under development to allow Tobit modeling.

The function also supports weighting of individual data rows. In this case, the model is estimated for each row individually, then each row likelihood is multiplied by its weight, and these weighted likelihoods summed to form the model-likelihood, which is to be minimized. This feature is used in the non-linear GxE model functions.

Additional features The umxACE function supports varying the DZ genetic association (defaulting to .5) to allow exploring assortative mating effects, as well as varying the DZ “C” factor from 1 (the default for modeling family-level effects shared 100 to .25 to model dominance effects.

note: Only one of C or D may be estimated simultaneously. This restriction reflects the lack of degrees of freedom to simultaneously model C and D with only MZ and DZ twin pairs ref?.

References

- http://www.github.com/tbates/umx

Examples

Run this code

# NOT RUN {
# ============================
# = How heritable is height? =
# ============================
require(umx)
data(twinData) # ?twinData from Australian twins.
# Pick the variables
selDVs = c("ht1", "ht2")
mzData <- twinData[twinData$zygosity %in% "MZFF", ]
dzData <- twinData[twinData$zygosity %in% "DZFF", ]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData) # -2ll= 9659
umxSummary(m1, std = FALSE) # unstandardized
# tip: with report = "html", umxSummary can print the table to your browser!
plot(m1)

# ========================================================
# = Evidence for dominance ? (DZ correlation set to .25) =
# ========================================================
m2 = umxACE("ADE", selDVs = selDVs, dzData = dzData, mzData = mzData, dzCr = .25)
umxCompare(m2, m1) # ADE is better
umxSummary(m2, comparison = m1) # nb: though this is ADE, columns are labeled ACE

# ==============================
# = Univariate model of weight =
# ==============================

# Things to note:
# 1. This variable has a large variance, but umx picks good starts.
# 2. umxACE can figure out variable names: provide "sep" and "wt" -> "wt1" "wt2"
# 3. umxACE picks the variables it needs from the data.
# 4. You can use boundDiag to lbound a, c, and e at 0 (prevents mirror-solutions).
m1 = umxACE(selDVs = "wt", dzData = dzData, mzData = mzData, sep = "", boundDiag = 0)
# We can modify this model, dropping shared environment, and see a comparison
m2 = umxModify(m1, update = "c_r1c1", comparison = TRUE)
# =====================================
# = Bivariate height and weight model =
# =====================================
data(twinData)
mzData <- twinData[twinData$zygosity %in% c("MZFF", "MZMM"),]
dzData <- twinData[twinData$zygosity %in% c("DZFF", "DZMM", "DZOS"), ]
mzData <- mzData[1:80,] # quicker run to keep CRAN happy
dzData <- dzData[1:80,]
selDVs = c("ht", "wt") # umx will add suffix (in this case "") + "1" or '2'
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
umxSummary(m1)

# =========================================================
# = Well done! Now you can make modify twin models in umx =
# =========================================================


# ===================
# = Ordinal example =
# ===================
require(umx)
data(twinData)
# Cut BMI column to form ordinal obesity variables
ordDVs = c("obese1", "obese2")
selDVs = c("obese")
obesityLevels = c('normal', 'overweight', 'obese')
cutPoints <- quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 <- cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
# Make the ordinal variables into mxFactors (ensure ordered is TRUE, and require levels)
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)
mzData <- twinData[twinData$zyg == 1, umx_paste_names(selDVs, "", 1:2)]
dzData <- twinData[twinData$zyg == 3, umx_paste_names(selDVs, "", 1:2)]
mzData <- mzData[1:80,] # just top 80 pairs to run fast
dzData <- dzData[1:80,]
str(mzData) # make sure mz, dz, and t1 and t2 have the same levels!
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
umxSummary(m1)

# ============================================
# = Bivariate continuous and ordinal example =
# ============================================
data(twinData)
selDVs = c("wt", "obese")
# Cut BMI column to form ordinal obesity variables
ordDVs = c("obese1", "obese2")
obesityLevels = c('normal', 'overweight', 'obese')
cutPoints <- quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 <- cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
# Make the ordinal variables into mxFactors (ensure ordered is TRUE, and require levels)
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)
mzData <- twinData[twinData$zyg == 1,] # umxACE can trim out unused variables on its own
dzData <- twinData[twinData$zyg == 3,]
mzData <- mzData[1:80,] # just top 80 so example runs in a couple of secs
dzData <- dzData[1:80,]
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
plot(m1)

# =======================================
# = Mixed continuous and binary example =
# =======================================
require(umx)
data(twinData)
# Cut to form category of 20% obese subjects
# and make into mxFactors (ensure ordered is TRUE, and require levels)
cutPoints <- quantile(twinData[, "bmi1"], probs = .2, na.rm = TRUE)
obesityLevels = c('normal', 'obese')
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 <- cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)

selDVs = c("wt", "obese")
mzData <- twinData[twinData$zygosity %in% "MZFF", umx_paste_names(selDVs, "", 1:2)]
dzData <- twinData[twinData$zygosity %in% "DZFF", umx_paste_names(selDVs, "", 1:2)]
# }
# NOT RUN {
m1 = umxACE(selDVs = selDVs, dzData = dzData, mzData = mzData, suffix = '')
umxSummary(m1)
# }
# NOT RUN {
# ===================================
# Example with covariance data only =
# ===================================

require(umx)
data(twinData)
selDVs = c("wt1", "wt2")
mz = cov(twinData[twinData$zyg == 1, selDVs], use = "complete")
dz = cov(twinData[twinData$zyg == 3, selDVs], use = "complete")
m1 = umxACE(selDVs = selDVs, dzData = dz, mzData = mz, numObsDZ=569, numObsMZ=351)
umxSummary(m1)
plot(m1)
# }

Run the code above in your browser using DataLab