multilevel.cor: Within-Group and Between-Group Correlation Matrix

Description

This function computes the within-group and between-group correlation matrix by calling the sem function in the R package lavaan. It can also provide standard errors, z test statistics, and significance values (p-values) for testing the hypothesis H0: \(\rho\) = 0 for all pairs of variables within and between groups. By default, the function computes the within-group and between-group correlation matrix without standard errors, z test statistics, and significance values.

Usage

multilevel.cor(data, ..., cluster, within = NULL, between = NULL,
               estimator = c("ML", "MLR"), optim.method = c("nlminb", "em"),
               missing = c("listwise", "fiml"), sig = FALSE, alpha = 0.05,
               print = c("all", "cor", "se", "stat", "p"), split = FALSE,
               order = FALSE, tri = c("both", "lower", "upper"), tri.lower = TRUE,
               p.adj = c("none", "bonferroni", "holm", "hochberg", "hommel",
                         "BH", "BY", "fdr"), digits = 2, p.digits = 3,
               as.na = NULL, write = NULL, append = TRUE, check = TRUE,
               output = TRUE)

Value

Returns an object of class misty.object, which is a list with the following entries:

call: function call
type: type of analysis
data: data frame specified in data including the group variable specified in cluster
args: specification of function arguments
model.fit: fitted lavaan object (mod.fit)
result: list with result tables, i.e., summary for the specification of the estimation method and missing data handling in lavaan; wb.cor, wb.se, wb.stat, and wb.p for the within- and between-group correlations, their standard errors, test statistics, and significance values; with.cor, with.se, with.stat, and with.p for the within-group correlations, their standard errors, test statistics, and significance values; betw.cor, betw.se, betw.stat, and betw.p for the between-group correlations, their standard errors, test statistics, and significance values
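
The entries listed above can be pulled out of the returned list by name. The following is a minimal sketch, assuming the packages misty and lavaan are installed and that the installed misty version matches the usage documented here. The call sets sig = TRUE because standard errors, z test statistics, and p-values are not stored otherwise.

```r
# Runs only if both packages are installed; otherwise 'mod' stays NULL
mod <- NULL
if (requireNamespace("misty", quietly = TRUE) &&
    requireNamespace("lavaan", quietly = TRUE)) {

  data("Demo.twolevel", package = "lavaan")

  # sig = TRUE so that standard errors, z statistics, and p-values are computed;
  # output = FALSE suppresses printing on the console
  mod <- misty::multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster",
                               sig = TRUE, output = FALSE)

  # Entry names as listed in the 'Value' section
  print(names(mod$result))
  print(mod$result$with.cor)   # within-group correlations
  print(mod$result$betw.p)     # p-values of the between-group correlations
}
```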

Arguments

data: a data frame.
...: an expression indicating the variable names in data, e.g., multilevel.cor(dat, x1, x2, x3). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
cluster: either a character string indicating the variable name of the cluster variable in data, or a vector representing the nested grouping structure (i.e., group or cluster variable).
within: a character vector representing variables that are measured on the within level and modeled only on the within level. Variables not mentioned in within or between are measured on the within level and will be modeled on both the within and between level.
between: a character vector representing variables that are measured on the between level and modeled only on the between level. Variables not mentioned in within or between are measured on the within level and will be modeled on both the within and between level.
estimator: a character string indicating the estimator to be used, i.e., "ML" for maximum likelihood with conventional standard errors and "MLR" for maximum likelihood with Huber-White robust standard errors. The default setting depends on the argument sig, i.e., "ML" is used when specifying sig = FALSE (default) and "MLR" is used when specifying sig = TRUE.
optim.method: a character string indicating the optimizer, i.e., "nlminb" (default) for the unconstrained and bounds-constrained quasi-Newton method optimizer and "em" for the Expectation Maximization (EM) algorithm.
missing: a character string indicating how to deal with missing data, i.e., "listwise" for listwise deletion or "fiml" (default) for the full information maximum likelihood (FIML) method. Note that models take longer to estimate when using FIML and that FIML is prone to issues with model convergence. These issues might be resolved by switching to listwise deletion.
sig: logical: if TRUE, statistically significant correlation coefficients are shown in boldface on the console. Note that standard errors, z test statistics, and significance values are not provided in the return object when sig = FALSE (default).
alpha: a numeric value between 0 and 1 indicating the significance level at which correlation coefficients are printed boldface when sig = TRUE.
print: a character string or character vector indicating which results to show on the console, i.e. "all" for all results, "cor" for correlation coefficients, "se" for standard errors, "stat" for z test statistics, and "p" for p-values.
split: logical: if TRUE, the output table is split into a within-group and a between-group correlation matrix.
order: logical: if TRUE, variables in the output table are ordered, so that variables specified in the argument between are shown first.
tri: a character string indicating which triangular of the matrix to show on the console when split = TRUE, i.e., "both" for the upper and lower triangular, "lower" for the lower triangular, and "upper" for the upper triangular.
tri.lower: logical: if TRUE (default) and split = FALSE (default), within-group correlations are shown in the lower triangular and between-group correlations are shown in the upper triangular.
p.adj: a character string indicating an adjustment method for multiple testing based on p.adjust, i.e., none (default), bonferroni, holm, hochberg, hommel, BH, BY, or fdr.
digits: an integer value indicating the number of decimal places to be used for displaying correlation coefficients.
p.digits: an integer value indicating the number of decimal places to be used for displaying p-values.
as.na: a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis. Note that the as.na() function is only applied to data but not to cluster.
write: a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or an Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.
append: logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.
check: logical: if TRUE (default), argument specification is checked.
output: logical: if TRUE (default), output is shown on the console.

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

Within-Group and Between-Group Variables: The specification of the within-group and between-group variables is in line with the syntax in Mplus. That is, the within argument is used to identify variables in the data frame specified in data that are measured at the individual level and modeled only at the within level. They are specified to have no variance in the between part of the model. The between argument is used to identify the variables in the data frame specified in data that are measured at the cluster level and modeled only at the between level. Variables not mentioned in the arguments within or between are measured at the individual level and will be modeled at both the within and between level.
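
The idea of modeling a variable at both levels can be illustrated with a purely descriptive decomposition in base R. This is a sketch on simulated toy data (all object names are hypothetical), and it is not what the function does internally: the function estimates the two correlation matrices with a two-level model in lavaan, whereas the observed cluster means used here still contain individual-level sampling error.

```r
# Hypothetical toy data: 50 clusters with 10 individuals each
set.seed(1)
n_cluster <- 50
n_per     <- 10
cluster   <- rep(seq_len(n_cluster), each = n_per)

# Cluster-level components shared by all members of a cluster
u1 <- rnorm(n_cluster)
u2 <- 0.6 * u1 + rnorm(n_cluster, sd = 0.8)

# Individual-level components
e1 <- rnorm(n_cluster * n_per)
e2 <- -0.3 * e1 + rnorm(n_cluster * n_per)

x1 <- u1[cluster] + e1
x2 <- u2[cluster] + e2

# Between part: cluster means; within part: deviations from the cluster mean
x1_mean <- ave(x1, cluster)
x2_mean <- ave(x2, cluster)
x1_dev  <- x1 - x1_mean
x2_dev  <- x2 - x2_mean

# Descriptive within-group correlation (pooled cluster-mean-centered scores)
r_within <- cor(x1_dev, x2_dev)

# Descriptive between-group correlation (one mean per cluster)
r_between <- cor(tapply(x1, cluster, mean), tapply(x2, cluster, mean))

print(c(within = r_within, between = r_between))
```

A variable listed in within would have no between part in such a decomposition, and a variable listed in between would consist of the cluster-level part only.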
Estimation Method and Missing Data Handling: The default setting for the argument estimator depends on the setting of the argument sig. If sig = FALSE (default), maximum likelihood estimation (estimator = "ML") is used, while maximum likelihood with Huber-White robust standard errors (estimator = "MLR"), which are robust against non-normality, is used when sig = TRUE. In the presence of missing data, the full information maximum likelihood (FIML) method (missing = "fiml") is used by default. Note that the FIML method cannot deal with within-group variables that have no variance within some clusters. In these cases, the function will switch to listwise deletion. Using the FIML method might result in issues with model convergence, which might be resolved by switching to listwise deletion (missing = "listwise").
Optimizer: The lavaan package uses a quasi-Newton optimization method ("nlminb") by default. If the optimizer does not converge, model estimation switches to the Expectation Maximization (EM) algorithm ("em").
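
Whether a variable has no variance within some clusters can be checked before fitting the model. The following base R sketch uses a small hypothetical data frame. It also flags clusters with fewer than two observed values, where the variance is undefined; whether the function treats those clusters the same way is not stated here, so this is an assumption of the sketch.

```r
# Hypothetical data: cluster 2 has identical values on x,
# cluster 3 has only one observed value
dat <- data.frame(
  cluster = rep(1:3, each = 4),
  x       = c(1.2, 0.7, 1.9, 1.1,   2.0, 2.0, 2.0, 2.0,   0.5, NA, NA, NA)
)

# Within-cluster variance, ignoring missing values
v <- tapply(dat$x, dat$cluster, var, na.rm = TRUE)

# Clusters whose within-cluster variance is zero or undefined
flagged <- names(v)[is.na(v) | v == 0]
print(flagged)
```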
Statistical Significance: Statistically significant correlation coefficients can be shown in boldface on the console by specifying sig = TRUE. However, this option is not supported when using R Markdown, i.e., the argument sig will switch to FALSE.
Adjustment Method for Multiple Testing: The adjustment method for multiple testing specified in the argument p.adj is applied to the within-group and between-group correlation matrix separately.
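
The effect of adjusting the two matrices separately can be sketched in base R with p.adjust. The z statistics below are hypothetical, and the two-sided p-value from a standard normal reference distribution is an assumption about how the z test is evaluated, not something taken from the function's source code.

```r
# Hypothetical z statistics for three variable pairs at each level
z_within  <- c(x1_x2 = 2.50, x1_x3 = 1.20, x2_x3 = -3.10)
z_between <- c(x1_x2 = 1.95, x1_x3 = 0.40, x2_x3 = 2.80)

# Two-sided p-values for H0: rho = 0 (assumed standard normal reference)
p_within  <- 2 * pnorm(-abs(z_within))
p_between <- 2 * pnorm(-abs(z_between))

# Separate adjustment: each level is a family of three tests
p_within_adj  <- p.adjust(p_within,  method = "bonferroni")
p_between_adj <- p.adjust(p_between, method = "bonferroni")

# For contrast only: pooling both levels into one family of six tests
p_pooled_adj <- p.adjust(c(p_within, p_between), method = "bonferroni")

print(round(rbind(within = p_within_adj, between = p_between_adj), 3))
```

With separate adjustment, the Bonferroni correction multiplies each p-value by the number of pairs at that level rather than by the total number of pairs across both levels, so it is never more conservative than the pooled version.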

References

Hox, J., Moerbeek, M., & van de Schoot, R. (2018). Multilevel analysis: Techniques and applications (3rd ed.). Routledge.

Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). Sage Publishers.

Examples

if (FALSE) {

# Load data set "Demo.twolevel" in the lavaan package
data("Demo.twolevel", package = "lavaan")

#----------------------------------------------------------------------------
# Cluster variable specification

# Example 1: Specification using the argument '...'
multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster")

# Alternative specification with cluster variable 'cluster' in 'data'
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3", "cluster")], cluster = "cluster")

# Alternative specification with cluster variable 'cluster' not in 'data'
multilevel.cor(Demo.twolevel[, c("y1", "y2", "y3")], cluster = Demo.twolevel$cluster)

#----------------------------------------------------------------------------
# Example 2: All variables modeled at both the within and between level
# Highlight statistically significant results at alpha = 0.05
multilevel.cor(Demo.twolevel, y1, y2, y3, sig = TRUE, cluster = "cluster")

# Example 3: Split output table into a within-group and a between-group correlation matrix
multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster", split = TRUE)

# Example 4: Print correlation coefficients, standard errors, z test statistics,
# and p-values
multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster", sig = TRUE, print = "all")

# Example 5: Print correlation coefficients and p-values
# with Bonferroni correction
multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster", sig = TRUE,
               print = c("cor", "p"), p.adj = "bonferroni")

#----------------------------------------------------------------------------
# Example 6: Variables "y1", "y2", and "y3" modeled at both the within and between level
# Variables "w1" and "w2" modeled at the cluster level
multilevel.cor(Demo.twolevel, y1, y2, y3, w1, w2, cluster = "cluster",
               between = c("w1", "w2"))

# Example 7: Show variables specified in the argument 'between' first
multilevel.cor(Demo.twolevel, y1, y2, y3, w1, w2, cluster = "cluster",
               between = c("w1", "w2"), order = TRUE)

#----------------------------------------------------------------------------
# Example 8: Variables "y1", "y2", and "y3" modeled only at the within level
# Variables "w1" and "w2" modeled at the cluster level
multilevel.cor(Demo.twolevel, y1, y2, y3, w1, w2, cluster = "cluster",
               within = c("y1", "y2", "y3"), between = c("w1", "w2"))

#----------------------------------------------------------------------------
# Example 9: lavaan model and summary of the multilevel model used to compute the
# within-group and between-group correlation matrix

mod <- multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster", output = FALSE)

# lavaan model syntax
mod$model

# Fitted lavaan object
lavaan::summary(mod$model.fit, standardized = TRUE)

#----------------------------------------------------------------------------
# Write Results

# Example 10a: Write Results into a text file
multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster",
               write = "Multilevel_Correlation.txt")

# Example 10b: Write Results into an Excel file
multilevel.cor(Demo.twolevel, y1, y2, y3, cluster = "cluster",
               write = "Multilevel_Correlation.xlsx")
}