SIMPLE.REGRESSION (version 0.2.3)

OLS_REGRESSION: Ordinary least squares regression

Description

Provides SPSS- and SAS-like output for ordinary least squares simultaneous entry regression and hierarchical entry regression. The output includes the Anova Table (Type III tests), standardized coefficients, partial and semi-partial correlations, collinearity statistics, casewise regression diagnostics, and plots of residuals and regression diagnostics. The output also includes Bayes Factors and, if requested, regression coefficients from Bayesian Markov Chain Monte Carlo (MCMC) analyses.

Usage

OLS_REGRESSION(data, DV, forced=NULL, hierarchical=NULL, 
               COVARS=NULL,
               plot_type = 'residuals', 
               CI_level = 95,
               MCMC = FALSE,
               Nsamples = 10000,
               verbose=TRUE, ...)

Value

An object of class "OLS_REGRESSION". The object is a list containing the following possible components:

modelMAIN

All of the lm function output for the regression model without interaction terms.

modelMAINsum

All of the summary.lm function output for the regression model without interaction terms.

anova_table

Anova Table (Type III tests).

mainRcoefs

Predictor coefficients for the model without interaction terms.

modeldata

All of the predictor and outcome raw data that were used in the model, along with regression diagnostic statistics for each case.

collin_diags

Collinearity diagnostic coefficients for models without interaction terms.
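As a rough illustration of what the anova_table, modeldata, and collin_diags components contain, the sketch below computes comparable statistics with base R only. It is not the package's own code: the built-in mtcars data, the chosen predictors, and the object names (type3, case_diags, tolerance, vif) are assumptions made for the example, and the package's actual column names and layout may differ.

```r
# Base-R sketch (not the package's internals) of statistics comparable to
# the anova_table, modeldata, and collin_diags components.
# mtcars and all object names below are illustrative assumptions.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Type III-style tests: each predictor is dropped from the full model in turn.
# For a model without interaction terms, these F tests match Type III tests.
type3 <- drop1(fit, test = "F")

# Casewise diagnostics: the model data plus per-case diagnostic statistics.
case_diags <- cbind(
  model.frame(fit),
  fitted     = fitted(fit),
  std_resid  = rstandard(fit),      # internally studentized residuals
  stud_resid = rstudent(fit),       # externally studentized residuals
  leverage   = hatvalues(fit),
  cooks_d    = cooks.distance(fit)
)

# Collinearity: tolerance is 1 - R^2 from regressing each predictor on the
# remaining predictors; the variance inflation factor (VIF) is its reciprocal.
predictors <- c("wt", "hp", "qsec")
tolerance <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  aux <- lm(reformulate(others, response = p), data = mtcars)
  1 - summary(aux)$r.squared
})
vif <- 1 / tolerance

print(type3)
print(head(case_diags))
print(rbind(tolerance = tolerance, VIF = vif))
```

Tolerance values near zero (equivalently, large VIF values) flag predictors that are nearly linear combinations of the other predictors.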

Arguments

data

A dataframe where the rows are cases and the columns are the variables.

DV

The name of the dependent variable.
Example: DV = 'outcomeVar'

forced

(optional) A vector of the names of the predictor variables for a forced/simultaneous entry regression. The variables can be numeric or factors.
Example: forced = c('VarA', 'VarB', 'VarC')

hierarchical

(optional) A list with the names of the predictor variables for each step of a hierarchical regression. The variables can be numeric or factors.
Example: hierarchical = list(step1=c('VarA', 'VarB'), step2=c('VarC', 'VarD'))

COVARS

(optional) The name(s) of possible covariate variable(s) for a moderated regression analysis.
Example: COVARS = c('CovarA', 'CovarB', 'CovarC')

plot_type

(optional) The kind of plots, if any. The options are:

  • 'residuals' (the default)

  • 'diagnostics' (for regression diagnostics), or

  • 'none' (for no plots).

Example: plot_type = 'diagnostics'

CI_level

(optional) The confidence level for the confidence intervals in the output, as a whole-number percentage. The default is 95.

MCMC

(logical) Should Bayesian MCMC analyses be conducted? The default is FALSE.

Nsamples

(optional) The number of samples for MCMC analyses. The default is 10000.

verbose

Should detailed results be displayed in the console? The options are: TRUE (default) or FALSE. If TRUE, plots of residuals are also produced.

...

(dots, for internal purposes only at this time.)
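To make the hierarchical and CI_level arguments more concrete, here is a base-R sketch of the ideas behind them. It assumes that each element of the hierarchical list adds its predictors to those of the earlier steps, and that a whole-number CI_level is treated as a percentage; the mtcars data and the object names are illustrative assumptions, not the package's implementation.

```r
# Base-R sketch of hierarchical entry and of a whole-number confidence level.
# Assumption: each step adds its predictors to those of the previous steps.
steps <- list(step1 = c("wt", "hp"), step2 = c("qsec", "drat"))

step1 <- lm(reformulate(steps$step1, response = "mpg"), data = mtcars)
step2 <- lm(reformulate(c(steps$step1, steps$step2), response = "mpg"),
            data = mtcars)

# Change in R-squared from step 1 to step 2, and the F test of that change.
r2_change <- summary(step2)$r.squared - summary(step1)$r.squared
step_test <- anova(step1, step2)

# A whole-number level such as 95 corresponds to level = 0.95 in confint().
CI_level <- 95
ci <- confint(step2, level = CI_level / 100)

print(r2_change)
print(step_test)
print(ci)
```

The F statistic in step_test is the test of whether the step 2 predictors explain variance beyond the step 1 predictors.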

Author

Brian P. O'Connor

Details

This function uses the lm function from the stats package, supplements the output with additional statistics, and formats the output so that it resembles SPSS and SAS regression output. The predictor variables can be numeric or factors.
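The kind of supplementary statistics described above can be derived from an ordinary lm fit. The following is a minimal sketch using standard textbook formulas with numeric predictors; it is not taken from the package source, and the mtcars data and object names are assumptions made for illustration.

```r
# Sketch: standardized coefficients and partial / semi-partial correlations
# derived from a plain lm() fit (textbook formulas, numeric predictors only).
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

b <- coef(fit)[-1]                          # raw slopes, intercept dropped
X <- model.matrix(fit)[, -1, drop = FALSE]  # predictor columns

# Standardized coefficient: raw slope * SD(predictor) / SD(outcome).
beta_std <- b * apply(X, 2, sd) / sd(mtcars$mpg)

tvals  <- coef(summary(fit))[-1, "t value"]
df_res <- df.residual(fit)
r2     <- summary(fit)$r.squared

# Semi-partial (part) correlation: its square is the drop in R-squared
# when the predictor is removed from the full model.
semipartial <- tvals * sqrt((1 - r2) / df_res)

# Partial correlation between each predictor and the outcome,
# controlling for the other predictors.
partial <- tvals / sqrt(tvals^2 + df_res)

print(round(cbind(b, beta_std, partial, semipartial), 3))
```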

The Bayesian MCMC analyses can be time-consuming for larger datasets. The MCMC analyses are conducted using functions, and their default settings, from the BayesFactor package (Morey & Rouder, 2024). The MCMC results can be verified using the model checking functions in the rstanarm package (e.g., Muth, Oravecz, & Gabry, 2018).

Good sources for interpreting residuals and diagnostics plots:

References

Bodner, T. E. (2016). Tumble graphs: Avoiding misleading end point extrapolation when graphing interactions from a moderated multiple regression analysis. Journal of Educational and Behavioral Statistics, 41, 593-604.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Darlington, R. B., & Hayes, A. F. (2017). Regression analysis and linear models: Concepts, applications, and implementation. Guilford Press.

Hayes, A. F. (2018a). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (2nd ed.). Guilford Press.

Hayes, A. F., & Montoya, A. K. (2016). A tutorial on testing, visualizing, and probing an interaction involving a multicategorical variable in linear regression analysis. Communication Methods and Measures, 11, 1-30.

Lee, M. D., & Wagenmakers, E. J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.

Morey, R., & Rouder, J. (2024). BayesFactor: Computation of Bayes Factors for Common Designs. R package version 0.9.12-4.7, https://github.com/richarddmorey/bayesfactor.

Muth, C., Oravecz, Z., & Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119.
https://doi.org/10.20982/tqmp.14.2.p099

O'Connor, B. P. (1998). All-in-one programs for exploring interactions in moderated multiple regression. Educational and Psychological Measurement, 58, 833-837.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3rd ed.). Wadsworth Thomson Learning.

Examples

# forced (simultaneous) entry
head(data_Green_Salkind_2014)
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               forced = c('quads','gluts','abdoms','arms','grip'))
# hierarchical entry
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               hierarchical = list( step1=c('quads','gluts','abdoms'), 
                                    step2=c('arms','grip')) )
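The returned object can also be stored and inspected rather than only printed. The sketch below assumes the SIMPLE.REGRESSION package is installed and that the component names listed under Value are present in the installed version; the requireNamespace guard and the res variable are additions for this example so that the code is skipped cleanly when the package is unavailable.

```r
# Sketch: store the returned list and inspect selected components.
# Assumes SIMPLE.REGRESSION is installed; otherwise the block is skipped.
res <- NULL
if (requireNamespace("SIMPLE.REGRESSION", quietly = TRUE)) {
  library(SIMPLE.REGRESSION)

  # plot_type = 'none' and verbose = FALSE suppress plots and console output
  res <- OLS_REGRESSION(data = data_Green_Salkind_2014, DV = 'injury',
                        forced = c('quads', 'gluts', 'abdoms', 'arms', 'grip'),
                        plot_type = 'none', verbose = FALSE)

  print(names(res))           # components available in this version
  print(res$anova_table)      # Anova Table (Type III tests)
  print(head(res$modeldata))  # raw data plus casewise diagnostics
}
```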
