umx (version 1.9.1)

umx_residualize: Easily residualize variables in long or wide dataframes, returning them changed in-place.

Description

Residualize one or more variables against covariates, and return the complete dataframe with the residualized variables in place. Optionally, this also works on wide (i.e., twin) data: just supply suffixes to identify the paired-wide columns (see examples).

Usage

umx_residualize(var, covs = NULL, suffixes = NULL, data)

Arguments

var

The base name of the variable you want to residualize. Alternatively, a regression formula with var on the left-hand side and covs on the right-hand side.

covs

Covariates to residualize on.

suffixes

Suffixes that identify the variable for each twin, e.g., c("_T1", "_T2"). It is up to you to check that all the suffixed variables are present!

data

The dataframe containing all the variables.

Value

- dataframe with var residualized in place (i.e., under its original column name)

Details

In R, residuals for a variable can be found with the following statement:

tmp <- residuals(lm(var ~ cov1 + cov2, data = data, na.action = na.exclude))

This tmp variable could then be written over the old column in data.
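As a minimal sketch of that manual workflow, the lines below residualize mpg on cyl and disp in a copy of the built-in mtcars data and overwrite the column by hand. The copy name df and the injected missing value are assumptions made for illustration; they are not part of umx.

```r
# Work on a copy so the built-in mtcars data is left untouched
df <- mtcars
df$cyl[3] <- NA   # illustrative missing covariate value (assumption, not in the original data)

# na.exclude drops the incomplete row when fitting, but pads the
# residuals with NA so they stay aligned with the rows of df
fit <- lm(mpg ~ cyl + disp, data = df, na.action = na.exclude)
tmp <- residuals(fit)

# Overwrite the original column with its residuals
df$mpg <- unname(tmp)
```

With the default na.action (na.omit), residuals() would return one value fewer than nrow(df), and the assignment back into the dataframe would fail.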

umx_residualize saves the user from building the lm, setting na.action, and replacing the data. It can also operate on a list of variables, and on wide data, expanding the var name using a set of variable-name suffixes.
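To make the wide-data case concrete, here is a rough sketch of how suffix expansion could be done by hand in base R. The function residualize_wide_sketch is hypothetical and is not part of umx. It assumes a single model fitted to the stacked twin data, which may or may not match what umx_residualize does internally.

```r
# Hypothetical helper (not a umx function): residualize one wide variable by hand
residualize_wide_sketch <- function(var, covs, suffixes, data) {
  base <- c(var, covs)
  # Stack each twin's columns into one long dataframe with un-suffixed names
  long <- do.call(rbind, lapply(suffixes, function(s) {
    block <- data[, paste0(base, s), drop = FALSE]
    names(block) <- base
    block
  }))
  # Assumption: one pooled model for all twins; na.exclude keeps rows aligned
  fit <- lm(reformulate(covs, response = var), data = long, na.action = na.exclude)
  res <- residuals(fit)
  # Write each twin's block of residuals back under its wide column name
  n <- nrow(data)
  for (i in seq_along(suffixes)) {
    rows <- ((i - 1) * n + 1):(i * n)
    data[[paste0(var, suffixes[i])]] <- unname(res[rows])
  }
  data
}

# Usage on fake "twin" data built from mtcars, as in the Examples section
tmp <- mtcars
tmp$mpg_T1  <- tmp$mpg_T2  <- tmp$mpg
tmp$cyl_T1  <- tmp$cyl_T2  <- tmp$cyl
tmp$disp_T1 <- tmp$disp_T2 <- tmp$disp
out <- residualize_wide_sketch("mpg", c("cyl", "disp"), c("_T1", "_T2"), data = tmp)
```

Because both twins here are exact copies of the same columns, the pooled fit has the same coefficients as a fit to mtcars alone, so mpg_T1 and mpg_T2 both hold the ordinary residuals of mpg ~ cyl + disp.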

References

- http://tbates.github.io, https://github.com/tbates/umx

See Also

Other Data Functions: umxCovData, umxFactor, umxHetCor, umxPadAndPruneForDefVars, umx_as_numeric, umx_cont_2_quantiles, umx_cov2raw, umx_long2wide, umx_lower2full, umx_make_MR_data, umx_make_TwinData, umx_make_bin_cont_pair_data, umx_make_fake_data, umx_merge_CIs, umx_read_lower, umx_reorder, umx_round, umx_scale_wide_twin_data, umx_scale, umx_swap_a_block, umx_wide2long, umx

Examples

# NOT RUN {
# Residualize mpg on cylinders and displacement
r1 = umx_residualize("mpg", c("cyl", "disp"), data = mtcars)
r2 = residuals(lm(mpg ~ cyl + disp, data = mtcars, na.action = na.exclude))
all(r1$mpg == r2)
# =====================
# = formula interface =
# =====================
r1 = umx_residualize(mpg ~ cyl + I(cyl^2) + disp, data = mtcars)
r2 = residuals(lm(mpg ~ cyl + I(cyl^2) + disp, data = mtcars, na.action = na.exclude))
all(r1$mpg == r2)

# ========================================================================
# = Demonstrate ability to residualize WIDE data (i.e. 1 family per row) =
# ========================================================================
tmp = mtcars
tmp$mpg_T1  = tmp$mpg_T2  = tmp$mpg
tmp$cyl_T1  = tmp$cyl_T2  = tmp$cyl
tmp$disp_T1 = tmp$disp_T2 = tmp$disp
umx_residualize("mpg", c("cyl", "disp"), c("_T1", "_T2"), data = tmp)[1:5,12:17]

# ===================================
# = Residualize several DVs at once =
# ===================================
df1 = umx_residualize(c("mpg", "hp"), covs = c("cyl", "disp"), data = tmp)
df2 = residuals(lm(hp ~ cyl + disp, data = tmp, na.action = na.exclude))
all(df1$hp == df2)
# }
