ma_r: Master framework for meta-analysis of correlations

Description

This is the master function for meta-analyses of correlations. It computes bare-bones, artifact-distribution, and individual-correction meta-analyses of correlations for any number of construct pairs.

When artifact-distribution meta-analyses are performed, this function automatically extracts the artifact information from a database and organizes it into the requested type of artifact distribution object (i.e., either Taylor series or interactive artifact distributions). The function can also clean databases containing inconsistently recorded artifact data, impute missing artifacts (when individual-correction meta-analyses are requested), and remove dependency among samples by forming composites or averaging effect sizes and artifacts. The automatic compositing features are employed when sample IDs and/or construct names are provided.

When multiple construct pairs are meta-analyzed, the result of this function takes on the class ma_master, which means that it is a list of meta-analyses. Follow-up analyses (e.g., sensitivity, heterogeneity, meta-regression) performed on ma_master objects will analyze data from all meta-analyses recorded in the object.

Usage

ma_r(rxyi, n, n_adj = NULL, sample_id = NULL, ma_method = "bb",
  ad_type = "tsa", correction_method = "auto", construct_x = NULL,
  construct_y = NULL, measure_x = NULL, measure_y = NULL,
  construct_order = NULL, wt_type = "sample_size", error_type = "mean",
  correct_bias = TRUE, correct_rel = NULL, correct_rxx = TRUE,
  correct_ryy = TRUE, correct_rr = NULL, correct_rr_x = TRUE,
  correct_rr_y = TRUE, indirect_rr = NULL, indirect_rr_x = TRUE,
  indirect_rr_y = TRUE, rxx = NULL, rxx_restricted = TRUE,
  rxx_type = "alpha", ryy = NULL, ryy_restricted = TRUE,
  ryy_type = "alpha", ux = NULL, ux_observed = TRUE, uy = NULL,
  uy_observed = TRUE, sign_rz = NULL, sign_rxz = 1, sign_ryz = 1,
  conf_level = 0.95, cred_level = 0.8, conf_method = "t",
  cred_method = "t", var_unbiased = TRUE, moderators = NULL,
  cat_moderators = TRUE, moderator_type = "simple", pairwise_ads = FALSE,
  residual_ads = TRUE, check_dependence = TRUE,
  collapse_method = "composite", intercor = 0.5, clean_artifacts = TRUE,
  impute_artifacts = ifelse(ma_method == "ad", FALSE, TRUE),
  impute_method = "bootstrap_mod", decimals = 2, hs_override = FALSE,
  use_all_arts = FALSE, supplemental_ads = NULL, data = NULL, ...)

Arguments

rxyi

Vector or column name of observed correlations.

n

Vector or column name of sample sizes.

n_adj

Optional: Vector or column name of sample sizes adjusted for sporadic artifact corrections.

sample_id

Optional vector of identification labels for samples/studies in the meta-analysis.

ma_method

Method to be used to compute the meta-analysis: "bb" (barebones), "ic" (individual correction), or "ad" (artifact distribution).

ad_type

For when ma_method is "ad", specifies the type of artifact distribution to use: "int" or "tsa".

correction_method

Character scalar or a square matrix with the collective levels of construct_x and construct_y as row names and column names. When ma_method is "ad", select one of the following methods for correcting artifacts: "auto", "meas", "uvdrr", "uvirr", "bvdrr", "bvirr", "rbOrig", "rb1Orig", "rb2Orig", "rbAdj", "rb1Adj", and "rb2Adj". Note that "rb1Orig", "rb2Orig", "rb1Adj", and "rb2Adj" can only be used when Taylor series artifact distributions are provided, and "rbOrig" and "rbAdj" can only be used when interactive artifact distributions are provided. See "Details" of ma_r_ad for descriptions of the available methods.

construct_x

Vector of construct names for constructs initially designated as X.

construct_y

Vector of construct names for constructs initially designated as Y.

measure_x

Vector of names for measures associated with constructs initially designated as X.

measure_y

Vector of names for measures associated with constructs initially designated as Y.

construct_order

Vector indicating the order in which variables should be arranged, with variables listed earlier in the vector being preferred for designation as X.

wt_type

Type of weight to use in the meta-analysis: options are "sample_size", "inv_var_mean" (inverse variance computed using mean effect size), and "inv_var_sample" (inverse variance computed using sample-specific effect sizes). Supported options borrowed from metafor are "DL", "HE", "HS", "SJ", "ML", "REML", "EB", and "PM" (see metafor documentation for details about the metafor methods).

error_type

Method to be used to estimate error variances: "mean" uses the mean effect size to estimate error variances and "sample" uses the sample-specific effect sizes.

correct_bias

Logical scalar that determines whether to correct correlations for small-sample bias (TRUE) or not (FALSE).

correct_rel

Optional named vector that supersedes correct_rxx and correct_ryy. Names should correspond to construct names in construct_x and construct_y to determine which constructs should be corrected for unreliability.

correct_rxx

Logical scalar or vector that determines whether to correct the X variable for measurement error (TRUE) or not (FALSE).

correct_ryy

Logical scalar or vector that determines whether to correct the Y variable for measurement error (TRUE) or not (FALSE).

correct_rr

Optional named vector that supersedes correct_rr_x and correct_rr_y. Names should correspond to construct names in construct_x and construct_y to determine which constructs should be corrected for range restriction.

correct_rr_x

Logical scalar, logical vector, or column name determining whether each correlation in rxyi should be corrected for range restriction in X (TRUE) or not (FALSE). If using artifact distribution methods, this must be a scalar value.

correct_rr_y

Logical scalar, logical vector, or column name determining whether each correlation in rxyi should be corrected for range restriction in Y (TRUE) or not (FALSE). If using artifact distribution methods, this must be a scalar value.

indirect_rr

Optional named vector that supersedes indirect_rr_x and indirect_rr_y. Names should correspond to construct names in construct_x and construct_y to determine which constructs should be corrected for indirect range restriction.

indirect_rr_x

Logical vector or column name determining whether each correlation in rxyi should be corrected for indirect range restriction in X (TRUE) or not (FALSE). Superseded in evaluation by correct_rr_x (i.e., if correct_rr_x == FALSE, the value supplied for indirect_rr_x is disregarded).

indirect_rr_y

Logical vector or column name determining whether each correlation in rxyi should be corrected for indirect range restriction in Y (TRUE) or not (FALSE). Superseded in evaluation by correct_rr_y (i.e., if correct_rr_y == FALSE, the value supplied for indirect_rr_y is disregarded).

rxx

Vector or column name of reliability estimates for X.

rxx_restricted

Logical vector or column name determining whether each element of rxx is an incumbent reliability (TRUE) or an applicant reliability (FALSE).

rxx_type, ryy_type

String vector identifying the types of reliability estimates supplied. Acceptable reliability types are:

internal_consistency A generic designation for internal-consistency reliability estimates derived from responses to a single test administration.
multiple_administrations A generic designation for reliability estimates derived from multiple administrations of a test.
alpha Coefficient alpha.
lambda Generic designation for a Guttman's lambda coefficient.
lambda1 Guttman's lambda 1 coefficient.
lambda2 Guttman's lambda 2 coefficient.
lambda3 Guttman's lambda 3 coefficient.
lambda4 Guttman's lambda 4 coefficient.
lambda5 Guttman's lambda 5 coefficient.
lambda6 Guttman's lambda 6 coefficient.
omega Omega coefficient indicating the proportion variance in a variable accounted for by modeled latent factors.
icc Intraclass correlation coefficient.
interrater_r Inter-rater correlation coefficient.
interrater_r_sb Inter-rater correlation coefficient, stepped up with the Spearman-Brown formula.
splithalf Split-half reliability coefficient.
splithalf_sb Split-half reliability coefficient, corrected toward the full test length with the Spearman-Brown formula.
retest Test-retest reliability coefficient.
parallel Parallel-forms reliability coefficient with tests taken during the same testing session.
alternate Alternate-forms reliability coefficient with tests taken during the same testing session.
parallel_delayed Parallel-forms reliability coefficient with tests taken during separate testing sessions with a time delay in between.
alternate_delayed Alternate-forms reliability coefficient with tests taken during separate testing sessions with a time delay in between.
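
The "_sb" reliability types above refer to coefficients that have been stepped up with the Spearman-Brown formula. As a minimal base-R sketch of that adjustment (the helper name spearman_brown and the input value are hypothetical and are not part of psychmeta):

```r
# Spearman-Brown prophecy formula: reliability of a measure lengthened by a
# factor of `k`. With k = 2, a half-test correlation is stepped up to the
# reliability of the full-length test.
# NOTE: `spearman_brown` is an illustrative helper, not a psychmeta function.
spearman_brown <- function(r, k = 2) {
  (k * r) / (1 + (k - 1) * r)
}

splithalf <- 0.60                          # hypothetical half-test correlation
splithalf_sb <- spearman_brown(splithalf)  # full-length reliability estimate
splithalf_sb
```

A coefficient that has already been stepped up should be labeled with the "_sb" type so that it is not adjusted a second time.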

ryy

Vector or column name of reliability estimates for Y.

ryy_restricted

Logical vector or column name determining whether each element of ryy is an incumbent reliability (TRUE) or an applicant reliability (FALSE).

ux

Vector or column name of u ratios for X.

ux_observed

Logical vector or column name determining whether each element of ux is an observed-score u ratio (TRUE) or a true-score u ratio (FALSE).

uy

Vector or column name of u ratios for Y.

uy_observed

Logical vector or column name determining whether each element of uy is an observed-score u ratio (TRUE) or a true-score u ratio (FALSE).
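
To make the role of a u ratio concrete, the sketch below computes an observed-score u ratio (restricted SD divided by unrestricted SD) and applies the classic univariate direct range-restriction correction (Thorndike Case II). It is a simplified base-R illustration that ignores measurement error; the helper names and input values are hypothetical, and psychmeta's own routines may differ in detail.

```r
# u ratio: SD in the restricted (e.g., incumbent) sample divided by the SD in
# the unrestricted (e.g., applicant) sample. Values below are hypothetical.
sd_restricted <- 6
sd_unrestricted <- 10
u <- sd_restricted / sd_unrestricted

# Thorndike Case II correction for univariate direct range restriction.
# NOTE: illustrative helpers, not psychmeta functions.
correct_uvdrr <- function(r, u) {
  r / (u * sqrt((1 / u^2 - 1) * r^2 + 1))
}

# Inverse operation: attenuate an unrestricted correlation for direct selection.
attenuate_uvdrr <- function(rho, u) {
  (u * rho) / sqrt((u^2 - 1) * rho^2 + 1)
}

r_obs <- 0.30                        # hypothetical restricted-sample correlation
rho_hat <- correct_uvdrr(r_obs, u)   # estimated unrestricted correlation
rho_hat
```

With u = 1 (no restriction), the correction leaves the correlation unchanged; smaller u ratios produce larger upward corrections.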

sign_rz

Optional named vector that supersedes sign_rxz and sign_ryz. Names should correspond to construct names in construct_x and construct_y to determine the sign of each construct's relationship with the selection mechanism.

sign_rxz

Sign of the relationship between X and the selection mechanism (for use with bvirr corrections only).

sign_ryz

Sign of the relationship between Y and the selection mechanism (for use with bvirr corrections only).

conf_level

Confidence level to define the width of the confidence interval (default = .95).

cred_level

Credibility level to define the width of the credibility interval (default = .80).

conf_method

Distribution to be used to compute the width of confidence intervals. Available options are "t" for t distribution or "norm" for normal distribution.

cred_method

Distribution to be used to compute the width of credibility intervals. Available options are "t" for t distribution or "norm" for normal distribution.

var_unbiased

Logical scalar determining whether variances should be unbiased (TRUE) or maximum-likelihood (FALSE).

moderators

Matrix or column names of moderator variables to be used in the meta-analysis (can be a vector in the case of one moderator).

cat_moderators

Logical scalar or vector identifying whether variables in the moderators argument are categorical variables (TRUE) or continuous variables (FALSE).

moderator_type

Type of moderator analysis: "none" means that no moderators are to be used, "simple" means that moderators are to be examined one at a time, and "hierarchical" means that all possible combinations and subsets of moderators are to be examined.

pairwise_ads

Logical value that determines whether to compute artifact distributions in a construct-pair-wise fashion (TRUE) or separately by construct (FALSE, default).

residual_ads

Logical argument that determines whether to use residualized variances (TRUE) or observed variances (FALSE) of artifact distributions to estimate sd_rho.

check_dependence

Logical scalar that determines whether database should be checked for violations of independence (TRUE) or not (FALSE).

collapse_method

Character argument that determines how to collapse dependent studies. Options are "composite" (default), "average", and "stop".

intercor

The intercorrelation(s) among variables to be combined into a composite. Can be a scalar or a named vector with elements named according to the names of constructs.
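
The way intercor enters a composite can be illustrated with the standard formula for the correlation between an outside variable and a unit-weighted composite of standardized components. This is a base-R sketch under that unit-weighting assumption; composite_r is a hypothetical helper, not a psychmeta function, and the package's compositing routine handles more general cases.

```r
# Correlation between an outside variable and a unit-weighted composite of
# `p` standardized components that share a common intercorrelation.
# NOTE: illustrative helper with hypothetical inputs.
composite_r <- function(r_components, intercor) {
  p <- length(r_components)
  # Numerator: covariance of the composite with the outside variable.
  # Denominator: SD of the composite (p variances plus p*(p-1) covariances).
  sum(r_components) / sqrt(p + p * (p - 1) * intercor)
}

# Two dependent effect sizes from one sample, both correlating .30 with Y,
# combined with an assumed intercorrelation of .50:
composite_r(c(0.30, 0.30), intercor = 0.5)
```

Lower intercorrelations yield larger composite correlations, so the value supplied for intercor directly affects the composited effect sizes.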

clean_artifacts

If TRUE, multiple instances of the same construct (or construct-measure pair, if measure is provided) in the database are compared and reconciled with each other in the case that any of the matching entries within a study have different artifact values. When impute_method is anything other than "stop", this method is always implemented to prevent discrepancies among imputed values.

impute_artifacts

If TRUE, artifact imputation will be performed (see impute_method for imputation procedures). Default is FALSE for artifact-distribution meta-analyses and TRUE otherwise. When imputation is performed, clean_artifacts is treated as TRUE so as to resolve all discrepancies among artifact entries before and after imputation.

impute_method

Method to use for imputing artifacts. Choices are:

bootstrap_mod Select random values from the most specific moderator categories available (default).
bootstrap_full Select random values from the full vector of artifacts.
simulate_mod Generate random values from the distribution with the mean and variance of observed artifacts from the most specific moderator categories available (uses rnorm for u ratios and rbeta for reliability values).
simulate_full Generate random values from the distribution with the mean and variance of all observed artifacts (uses rnorm for u ratios and rbeta for reliability values).
wt_mean_mod Replace missing values with the sample-size weighted mean of the distribution of artifacts from the most specific moderator categories available (not recommended).
wt_mean_full Replace missing values with the sample-size weighted mean of the full distribution of artifacts (not recommended).
unwt_mean_mod Replace missing values with the unweighted mean of the distribution of artifacts from the most specific moderator categories available (not recommended).
unwt_mean_full Replace missing values with the unweighted mean of the full distribution of artifacts (not recommended).
replace_unity Replace missing values with 1 (not recommended).
stop Stop evaluations when missing artifacts are encountered.

If an imputation method ending in "mod" is selected but no moderators are provided, the "mod" suffix will internally be replaced with "full".
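
The difference between the bootstrap and weighted-mean strategies can be sketched in base R as follows. This is only a conceptual illustration of the "full" variants with hypothetical data; it is not psychmeta's internal code, which also handles moderator categories and artifact cleaning.

```r
set.seed(1)  # bootstrap draws are random; fix the seed for reproducibility

# Hypothetical reliability estimates with two missing entries
rxx <- c(0.80, NA, 0.75, 0.90, NA)
n   <- c(100, 150, 80, 200, 120)
observed <- !is.na(rxx)

# Analogue of "bootstrap_full": fill each gap with a randomly drawn observed value.
# Indexing via sample.int() avoids sample()'s special case for length-1 vectors.
rxx_boot <- rxx
draws <- sample.int(sum(observed), size = sum(!observed), replace = TRUE)
rxx_boot[!observed] <- rxx[observed][draws]

# Analogue of "wt_mean_full": fill every gap with the sample-size-weighted mean.
rxx_wt <- rxx
rxx_wt[!observed] <- weighted.mean(rxx[observed], w = n[observed])

rxx_boot
rxx_wt
```

Mean replacement gives every imputed case the same value, which shrinks the variance of the artifact distribution; the bootstrap approach preserves that variance on average.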

decimals

Number of decimal places to which interactive artifact distributions should be rounded (default is 2 decimal places).

hs_override

When TRUE, this will override settings for wt_type (will set to "sample_size"), error_type (will set to "mean"), correct_bias (will set to TRUE), conf_method (will set to "norm"), cred_method (will set to "norm"), and var_unbiased (will set to FALSE).

use_all_arts

Logical scalar that determines whether artifact values from studies without valid effect sizes should be used in artifact distributions (TRUE) or not (FALSE).

supplemental_ads

Named list (named according to the constructs included in the meta-analysis) of supplemental artifact distribution information from studies not included in the meta-analysis. This is a list of lists, where the elements of a list associated with a construct are named like the arguments of the create_ad() function.

data

Data frame containing columns whose names may be provided as arguments to vector arguments and/or moderators.

...

Further arguments to be passed to functions called within the meta-analysis.

Value

A list object of the classes psychmeta, ma_r_as_r, ma_bb (and ma_ic or ma_ad, as appropriate). Components of output tables for bare-bones meta-analyses:

Pair_ID Unique identification number for each construct pairing.
Construct_X Name of the variable analyzed as construct X.
Construct_Y Name of the variable analyzed as construct Y.
Analysis_ID Unique identification number for each moderator analysis within a construct pairing.
Analysis_Type Type of moderator analyses: Overall, Simple Moderator, or Hierarchical Moderator.
k Number of effect sizes meta-analyzed.
N Total sample size of all effect sizes in the meta-analysis.
mean_r Mean observed correlation.
var_r Weighted variance of observed correlations.
var_e Predicted sampling-error variance of observed correlations.
var_res Variance of observed correlations after removing predicted sampling-error variance.
sd_r Square root of var_r.
se_r Standard error of mean_r.
sd_e Square root of var_e.
sd_res Square root of var_res.
CI_LL_XX Lower limit of the confidence interval around mean_r, where "XX" represents the confidence level as a percentage.
CI_UL_XX Upper limit of the confidence interval around mean_r, where "XX" represents the confidence level as a percentage.
CV_LL_XX Lower limit of the credibility interval around mean_r, where "XX" represents the credibility level as a percentage.
CV_UL_XX Upper limit of the credibility interval around mean_r, where "XX" represents the credibility level as a percentage.
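
The relationships among the bare-bones columns above can be sketched in base R. The example assumes sample-size weights, error variances based on the mean correlation, unbiased variances, and t-based intervals (mirroring the defaults listed under Usage); it omits the small-sample bias correction and uses hypothetical data, so it will not reproduce ma_r output exactly.

```r
# Hypothetical observed correlations and sample sizes
r <- c(0.20, 0.30, 0.25, 0.40)
n <- c(100, 150, 80, 200)
k <- length(r)   # number of effect sizes
N <- sum(n)      # total sample size

mean_r <- sum(n * r) / N                             # sample-size-weighted mean
var_r  <- sum(n * (r - mean_r)^2) / N * k / (k - 1)  # weighted variance (unbiased form)

# Sampling-error variance per study, evaluated at the mean correlation,
# then averaged with the same weights
var_e_i <- (1 - mean_r^2)^2 / (n - 1)
var_e   <- sum(n * var_e_i) / N

var_res <- var_r - var_e
sd_r    <- sqrt(var_r)
se_r    <- sd_r / sqrt(k)
sd_res  <- sqrt(max(var_res, 0))  # truncated at zero so the square root is defined

# 95% confidence interval and 80% credibility interval, both t-based
ci <- mean_r + c(-1, 1) * qt(0.975, df = k - 1) * se_r
cv <- mean_r + c(-1, 1) * qt(0.90,  df = k - 1) * sd_res

round(c(mean_r = mean_r, var_r = var_r, var_e = var_e, var_res = var_res), 4)
```

The confidence interval describes uncertainty in mean_r, whereas the credibility interval describes the spread of the residual distribution (sd_res).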

Components of output tables for individual-correction meta-analyses:

Pair_ID Unique identification number for each construct pairing.
Construct_X Name of the variable analyzed as construct X.
Construct_Y Name of the variable analyzed as construct Y.
Analysis_ID Unique identification number for each moderator analysis within a construct pairing.
Analysis_Type Type of moderator analyses: Overall, Simple Moderator, or Hierarchical Moderator.
k Number of effect sizes meta-analyzed.
N Total sample size of all effect sizes in the meta-analysis.
mean_r Mean observed correlation.
var_r Weighted variance of observed correlations.
var_e Predicted sampling-error variance of observed correlations.
var_res Variance of observed correlations after removing predicted sampling-error variance.
sd_r Square root of var_r.
se_r Standard error of mean_r.
sd_e Square root of var_e.
sd_res Square root of var_res.
mean_rho Mean artifact-corrected correlation.
var_r_c Variance of artifact-corrected correlations.
var_e_c Predicted sampling-error variance of artifact-corrected correlations.
var_rho Variance of artifact-corrected correlations after removing predicted sampling-error variance.
sd_r_c Square root of var_r_c.
se_r_c Standard error of mean_rho.
sd_e_c Square root of var_e_c.
sd_rho Square root of var_rho.
CI_LL_XX Lower limit of the confidence interval around mean_rho, where "XX" represents the confidence level as a percentage.
CI_UL_XX Upper limit of the confidence interval around mean_rho, where "XX" represents the confidence level as a percentage.
CV_LL_XX Lower limit of the credibility interval around mean_rho, where "XX" represents the credibility level as a percentage.
CV_UL_XX Upper limit of the credibility interval around mean_rho, where "XX" represents the credibility level as a percentage.
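
As a rough base-R sketch of how the corrected columns relate to the observed ones, the example below corrects each correlation for unreliability in X and Y only, using Hunter-Schmidt-style weights. The data are hypothetical, range restriction and bias corrections are omitted, and psychmeta's weighting depends on wt_type, so treat this as an illustration rather than a reproduction of ma_r output.

```r
# Hypothetical observed correlations, sample sizes, and reliabilities
r   <- c(0.20, 0.30, 0.25, 0.40)
n   <- c(100, 150, 80, 200)
rxx <- c(0.80, 0.75, 0.85, 0.90)
ryy <- c(0.70, 0.80, 0.75, 0.85)

a   <- sqrt(rxx * ryy)  # compound attenuation factor for each study
r_c <- r / a            # correlations corrected for measurement error

# Assumption: Hunter-Schmidt weights (n * a^2), which down-weight studies
# needing large corrections
wt <- n * a^2
mean_rho <- sum(wt * r_c) / sum(wt)
var_r_c  <- sum(wt * (r_c - mean_rho)^2) / sum(wt)

# Sampling-error variance of each corrected correlation: the observed-score
# error variance inflated by 1 / a^2
mean_r  <- sum(n * r) / sum(n)
var_e_i <- (1 - mean_r^2)^2 / (n - 1)
var_e_c <- sum(wt * var_e_i / a^2) / sum(wt)

var_rho <- var_r_c - var_e_c
sd_rho  <- sqrt(max(var_rho, 0))  # truncated at zero so the square root is defined

round(c(mean_rho = mean_rho, var_r_c = var_r_c, var_e_c = var_e_c, sd_rho = sd_rho), 4)
```

Because each correction divides by a factor below 1, both the corrected correlations and their sampling-error variances are larger than their observed counterparts.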

Components of output tables for artifact-distribution meta-analyses:

Pair_ID Unique identification number for each construct pairing.
Construct_X Name of the variable analyzed as construct X.
Construct_Y Name of the variable analyzed as construct Y.
Analysis_ID Unique identification number for each moderator analysis within a construct pairing.
Analysis_Type Type of moderator analyses: Overall, Simple Moderator, or Hierarchical Moderator.
k Number of effect sizes meta-analyzed.
N Total sample size of all effect sizes in the meta-analysis.
mean_r Mean observed correlation.
var_r Weighted variance of observed correlations.
var_e Predicted sampling-error variance of observed correlations.
var_art Amount of variance in observed correlations that is attributable to measurement-error and range-restriction artifacts.
var_pre Total predicted artifactual variance (i.e., the sum of var_e and var_art).
var_res Variance of observed correlations after removing predicted sampling-error variance and predicted artifact variance.
sd_r Square root of var_r.
sd_e Square root of var_e.
sd_art Square root of var_art.
sd_pre Square root of var_pre.
sd_res Square root of var_res.
mean_rho Mean artifact-corrected correlation.
var_rho Variance of artifact-corrected correlations after removing predicted sampling-error variance and predicted artifact variance.
sd_rho Square root of var_rho.
CI_LL_XX Lower limit of the confidence interval around mean_rho, where "XX" represents the confidence level as a percentage.
CI_UL_XX Upper limit of the confidence interval around mean_rho, where "XX" represents the confidence level as a percentage.
CV_LL_XX Lower limit of the credibility interval around mean_rho, where "XX" represents the credibility level as a percentage.
CV_UL_XX Upper limit of the credibility interval around mean_rho, where "XX" represents the credibility level as a percentage.
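
The variance decomposition in the artifact-distribution table can be illustrated with a first-order Taylor series approximation for measurement-error corrections only. All input values are hypothetical, q denotes the square root of a reliability, and the formulas are a simplified sketch; the methods documented under ma_r_ad are more complete.

```r
# Hypothetical bare-bones results
mean_r <- 0.30
var_r  <- 0.0150
var_e  <- 0.0060

# Hypothetical artifact distributions (q = square root of reliability)
mean_qx <- 0.90; var_qx <- 0.0020
mean_qy <- 0.85; var_qy <- 0.0030

a <- mean_qx * mean_qy  # mean compound attenuation factor
mean_rho <- mean_r / a  # mean corrected correlation

# First-order (delta-method) approximation of the variance in observed
# correlations produced by variation in qx and qy, given r = rho * qx * qy
var_art <- mean_rho^2 * (mean_qy^2 * var_qx + mean_qx^2 * var_qy)

var_pre <- var_e + var_art       # total predicted artifactual variance
var_res <- var_r - var_pre       # residual variance of observed correlations
var_rho <- var_res / a^2         # residual variance rescaled to the rho metric
sd_rho  <- sqrt(max(var_rho, 0)) # truncated at zero so the square root is defined

round(c(mean_rho = mean_rho, var_art = var_art, var_pre = var_pre,
        var_res = var_res, sd_rho = sd_rho), 4)
```

The same decomposition explains why sd_rho exceeds sd_res: residual variation is divided by an attenuation factor smaller than 1.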

References

Schmidt, F. L., & Hunter, J. E. (2015). Methods of meta-analysis: Correcting error and bias in research findings (3rd ed.). Thousand Oaks, CA: Sage. https://doi.org/10/b6mg. Chapter 3.

Dahlke, J. A., & Wiernik, B. M. (2017). One of these artifacts is not like the others: New methods to account for the unique implications of indirect range-restriction corrections in organizational research. Unpublished manuscript.

Examples

# NOT RUN {
## The 'ma_r' function can compute multi-construct bare-bones meta-analyses:
ma_r(rxyi = rxyi, n = n, rxx = rxxi, ryy = ryyi,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     moderators = moderator, data = data_r_meas_multi)

## It can also perform multiple individual-correction meta-analyses:
ma_r(ma_method = "ic", rxyi = rxyi, n = n, rxx = rxxi, ryy = ryyi,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     moderators = moderator, data = data_r_meas_multi)

## And 'ma_r' can also curate artifact distributions and compute multiple
## artifact-distribution meta-analyses:
ma_r(ma_method = "ad", rxyi = rxyi, n = n, rxx = rxxi, ryy = ryyi,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     moderators = moderator, data = data_r_meas_multi)

## Artifact information from studies not included in the meta-analysis can also be used to make
## corrections. Passing artifact information with the 'supplemental_ads' argument allows for
## additional artifact values and/or means and variances of artifacts to be used.
## The 'supplemental_ads' analysis below gives the same results as the prior meta-analysis.
x_ids <- c(data_r_meas_multi$x_name, data_r_meas_multi$y_name) == "X"
rxxi <- c(data_r_meas_multi$rxxi, data_r_meas_multi$ryyi)[x_ids]
n_rxxi <- c(data_r_meas_multi$n, data_r_meas_multi$n)[x_ids]

y_ids <- c(data_r_meas_multi$x_name, data_r_meas_multi$y_name) == "Y"
ryyi <- c(data_r_meas_multi$rxxi, data_r_meas_multi$ryyi)[y_ids]
n_ryyi <- c(data_r_meas_multi$n, data_r_meas_multi$n)[y_ids]

z_ids <- c(data_r_meas_multi$x_name, data_r_meas_multi$y_name) == "Z"
rzzi <- c(data_r_meas_multi$rxxi, data_r_meas_multi$ryyi)[z_ids]
n_rzzi <- c(data_r_meas_multi$n, data_r_meas_multi$n)[z_ids]

ma_r(ma_method = "ad", rxyi = rxyi, n = n,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name,
     moderators = moderator, sample_id = sample_id, data = data_r_meas_multi,
     supplemental_ads = list(X = list(rxxi = rxxi, n_rxxi = n_rxxi, wt_rxxi = n_rxxi),
                             Y = list(rxxi = ryyi, n_rxxi = n_ryyi, wt_rxxi = n_ryyi),
                             Z = list(rxxi = rzzi, n_rxxi = n_rzzi, wt_rxxi = n_rzzi)))

## If 'use_all_arts' is set to TRUE, artifacts from studies without valid correlations
## will be used to inform artifact distributions. Below, correlations and artifacts
## are provided by non-overlapping sets of studies.
dat1 <- dat2 <- data_r_meas_multi
dat1$rxxi <- dat1$ryyi <- NA
dat2$rxyi <- NA
dat2$sample_id <- dat2$sample_id + 40
dat <- rbind(dat1, dat2)
ma_r(ma_method = "ad", rxyi = rxyi, n = n, rxx = rxxi, ryy = ryyi,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name,
     sample_id = sample_id, moderators = moderator,
     use_all_arts = TRUE, data = dat)
# }