Performs several RSA model tests on a data set with two predictors
RSA(
formula,
data = NULL,
center = "none",
scale = "none",
na.rm = FALSE,
out.rm = TRUE,
breakline = FALSE,
models = "default",
cubic = FALSE,
verbose = TRUE,
add = "",
estimator = "MLR",
se = "robust",
missing = NA,
control.variables = c(),
center.control.variables = FALSE,
...
)
A formula in the form z ~ x*y, specifying the variable names used from the data frame, where z is the name of the response variable and x and y are the names of the predictor variables.
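A minimal sketch of a call with this formula interface, using simulated data (the variable names, sample size, and simulated effects are purely illustrative):

library(RSA)

# Simulated example data: z depends on x, y, and their (in)congruence
set.seed(1)
df <- data.frame(x = rnorm(300), y = rnorm(300))
df$z <- 0.3 * df$x + 0.3 * df$y - 0.2 * (df$x - df$y)^2 + rnorm(300, 0, 0.5)

# Fit the default set of polynomial models
r1 <- RSA(z ~ x * y, data = df)
print(r1)
plot(r1, model = "full")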
A data frame with the variables used in the formula.
Method for centering the predictor variables before the analysis. Default option ("none") applies no centering. "pooled" centers the predictor variables on their pooled sample mean, which preserves the commensurability of the predictor scales. "variablewise" centers the predictor variables on their respective sample mean. You should think carefully before applying the "variablewise" option, as centering the predictor variables at different values (e.g., their respective means) can affect the commensurability of the predictor scales.
Method for scaling the predictor variables before the analysis. Default option ("none") applies no scaling. "pooled" scales the predictor variables on their pooled sample SD, which preserves the commensurability of the predictor scales. "variablewise" scales the predictor variables on their respective sample SD. You should think carefully before applying the "variablewise" option, as scaling the predictor variables at different values (e.g., their respective SDs) can affect the commensurability of the predictor scales.
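For example, pooled centering and scaling could be requested as follows (a sketch, reusing the simulated df from the first example):

# Center and scale both predictors on their pooled mean/SD,
# which preserves the commensurability of the predictor scales
r.pooled <- RSA(z ~ x * y, data = df, center = "pooled", scale = "pooled")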
Should missing values be removed before proceeding?
Should outliers, identified according to the Bollen & Jackman (1980) criteria, be excluded from the analyses? In large data sets this check is the speed bottleneck. If you are sure that no outliers exist, set this option to FALSE for a speed improvement.
Should the breakline in the unconstrained absolute difference model be allowed? (The breakline is possible from the model formulation but is empirically rather unrealistic.) Defaults to FALSE.
A vector with the names of all models that should be computed. Should be any from c("absdiff", "absunc", "diff", "mean", "additive", "IA", "SQD", "RR", "SRR", "SRRR", "SSQD", "SRSQD", "full", "null", "onlyx", "onlyy", "onlyx2", "onlyy2", "cubic", "CA", "RRCA", "CL", "RRCL"). For models="all", all models are computed; for models="default", all models besides the absolute difference models are computed.
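A sketch of fitting only a subset of models and comparing them afterwards with compare() (reusing the simulated df from above; the selected model names are just examples):

# Fit only the full polynomial, squared difference, and interaction models
r2 <- RSA(z ~ x * y, data = df, models = c("full", "SQD", "IA"))

# Compare the fitted models against each other
compare(r2)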
Should the cubic models with the additional terms Y^3, XY^2, YX^2, and X^3 be included?
Should additional information during the computation process be printed?
Additional syntax that is added to the lavaan model. Can contain, for example, additional constraints, like "p01 == 0; p11 == 0"
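A sketch of passing extra lavaan syntax via add, using the constraint string quoted above (simulated df from the first example):

# Constrain two polynomial coefficients to zero in the full model
r3 <- RSA(z ~ x * y, data = df, models = "full", add = "p01 == 0; p11 == 0")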
Type of estimator that should be used by lavaan. Defaults to "MLR", which provides robust standard errors, a robust scaled test statistic, and can handle missing values. If you want to reproduce standard OLS estimates, use estimator="ML" and se="standard".
Type of standard errors. This parameter gets passed through to the sem function of the lavaan package; see the options there. By default, robust SEs are computed. If you use se="boot", lavaan provides CIs and p-values based on the bootstrapped standard error. If you use confint(..., method="boot"), in contrast, you get CIs and p-values based on the percentile bootstrap (see also confint.RSA).
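A sketch contrasting the two bootstrap routes described above (reusing the simulated df; the model argument of confint.RSA is an assumption here, see its help file for the exact interface):

# Route 1: bootstrapped standard errors computed by lavaan itself
r.boot <- RSA(z ~ x * y, data = df, models = "full", se = "boot")

# Route 2: percentile bootstrap CIs and p-values via confint()
confint(r.boot, model = "full", method = "boot")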
Handling of missing values (this parameter is passed to the lavaan sem function). By default (missing=NA), Full Information Maximum Likelihood (FIML) is employed in case of missing values. If cases with missing values should be excluded, use missing = "listwise".
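For instance, listwise deletion instead of FIML could be requested like this (sketch, simulated df from above):

# Exclude cases with missing values instead of using FIML
r.lw <- RSA(z ~ x * y, data = df, missing = "listwise")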
A string vector with variable names from data. These variables are added as linear predictors to the model (in order "to control for them"). No interactions with the other variables are modeled.
Should the control variables be centered before analyses? This can improve interpretability of the intercept, which will then reflect the predicted outcome value at the point (X,Y)=(0,0) when all control variables take their respective average values.
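A sketch with a simulated, purely illustrative covariate c1 added as a centered linear control variable:

# Control for c1 linearly, centered for a more interpretable intercept
df$c1 <- rnorm(300)
r.ctrl <- RSA(z ~ x * y, data = df,
              control.variables = "c1",
              center.control.variables = TRUE)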
Additional parameters passed to the lavaan sem function.
Even if the main variables of the model are normally distributed, their squared terms and interaction terms are necessarily non-normal. By default, the RSA function uses a scaled test statistic (test="Satorra-Bentler") and robust standard errors (se="robust"), which are robust against violations of the normality assumption.
Why does my standard polynomial regression give different p-values and SEs than the RSA package? Shouldn't they be the same? This is due to the robust standard errors employed in the RSA package. If you set estimator="ML" and se="standard", you get p-values that are very close to the standard approach. (They might still not be identical, because the standard regression approach usually uses an OLS estimator while RSA uses an ML estimator.)
Experimental feature (use with caution!): You can also fit binary outcome variables with a probit link function. For that purpose, the response variable has to be declared as "ordered" and the lavaan estimator changed to "WLSMV": r1.binary <- RSA(z.binary~x*y, df, ordered="z.binary", estimator="WLSMV", models="full") (for more details see the help file of the sem function in the lavaan package). The results can also be plotted with probabilities on the z axis using the probit link function: plot(r1.binary, link="probit", zlim=c(0, 1), zlab="Probability"). For plotting, the binary outcome variable must be coded with 0 and 1 (not as a factor).
lavaan at the moment only supports a probit link function for binary outcomes, not a logit link. Please be aware that this experimental feature can fit the full model, but most other functions (such as model comparisons) might break and errors might show up.
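Putting the quoted calls together as a runnable sketch (the binary outcome is created here by dichotomizing the simulated z; remember that this feature is experimental):

# Binary outcome coded 0/1 (not as a factor)
df$z.binary <- as.numeric(df$z > median(df$z))

# Probit RSA: outcome declared as ordered, WLSMV estimator, full model only
r1.binary <- RSA(z.binary ~ x * y, df, ordered = "z.binary",
                 estimator = "WLSMV", models = "full")

# Plot predicted probabilities on the z axis
plot(r1.binary, link = "probit", zlim = c(0, 1), zlab = "Probability")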
Edwards, J. R. (2002). Alternatives to difference scores: Polynomial regression analysis and response surface methodology. In F. Drasgow & N. W. Schmitt (Eds.), Advances in measurement and data analysis (pp. 350–400). San Francisco, CA: Jossey-Bass.
Humberg, S., Nestler, S., & Back, M. D. (2019). Response Surface Analysis in Personality and Social Psychology: Checklist and Clarifications for the Case of Congruence Hypotheses. Social Psychological and Personality Science, 10(3), 409–419. doi:10.1177/1948550618757600
Humberg, S., Schönbrodt, F. D., Back, M. D., & Nestler, S. (in press). Cubic response surface analysis: Investigating asymmetric and level-dependent congruence effects with third-order polynomial models. Psychological Methods. doi:10.1037/met0000352
Nestler, S., Humberg, S., & Schönbrodt, F. D. (2019). Response surface analysis with multilevel data: Illustration for the case of congruence hypotheses. Psychological Methods, 24(3), 291–308. doi:10.1037/met0000199
Schönbrodt, F. D. (2016). Testing fit patterns with polynomial regression models. Retrieved from osf.io/3889z
Schönbrodt, F. D., Humberg, S., & Nestler, S. (2018). Testing similarity effects with dyadic response surface analysis. European Journal of Personality, 32(6), 627–641. doi:10.1002/per.2169
demoRSA, plotRSA, RSA.ST, confint.RSA, compare