psych: A package for personality, psychometric, and psychological research

Description

The psych package has been developed at Northwestern University to include functions most useful for personality and psychological research. Some of the functions (e.g., describe, pairs.panels, error.bars) are useful for basic descriptive analyses.

Psychometric applications include routines for Very Simple Structure (VSS), Item Cluster Analysis (ICLUST), and principal axes factor analysis (factor.pa). Other functions perform Schmid Leiman transformations (schmid), which transform a hierarchical factor structure into a bifactor solution, graph both structures (omega.graph), and calculate the reliability coefficients alpha (score.items, score.multiple.choice), beta (ICLUST), and McDonald's omega (omega and omega.graph).

The score.items and score.multiple.choice functions may be used to form single or multiple scales from sets of dichotomous, multilevel, or multiple choice items by specifying scoring keys.

Additional functions make for more convenient descriptions of item characteristics. Functions under development include 1 and 2 parameter Item Response measures.

A number of procedures have been developed as part of the Synthetic Aperture Personality Assessment (SAPA) project. These routines facilitate forming and analyzing composite scales; the results are equivalent to those obtained from the raw data, but are found by adding within and between cluster/scale item correlations. These functions include extracting clusters from factor loading matrices (factor2cluster), synthetically forming clusters from correlation matrices (cluster.cor), and finding multiple correlations from correlation matrices (mat.regress). Functions to generate simulated data with particular structures include circ.sim, item.sim, and congeneric.sim. The most recent development version of the package is always available for download as a source file from the repository at http://personality-project.org/r/src/contrib/.
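The idea of forming composites directly from a correlation matrix can be illustrated with a minimal base R sketch. It is not the code of cluster.cor; the helper name composite_cor, the keys layout, and the correlation values are assumptions made for illustration. The covariance between two unit-weighted composites is the sum of the correlations between their items, which is then rescaled to a correlation.

```r
# Hypothetical helper (not a psych function): correlations among unit-weighted
# composites, computed only from the item correlation matrix.
composite_cor <- function(R, keys) {
  # keys: items x scales matrix of 1, 0, or -1 (reverse-keyed) weights
  C <- t(keys) %*% R %*% keys   # covariances among the composites
  cov2cor(C)                    # rescale covariances to correlations
}

# Made-up correlations among four items
R <- matrix(c(1.0, 0.6, 0.3, 0.2,
              0.6, 1.0, 0.2, 0.3,
              0.3, 0.2, 1.0, 0.5,
              0.2, 0.3, 0.5, 1.0), nrow = 4, byrow = TRUE)

# Items 1-2 form scale A, items 3-4 form scale B
keys <- cbind(A = c(1, 1, 0, 0), B = c(0, 0, 1, 1))

scale_cor <- composite_cor(R, keys)
print(round(scale_cor, 3))
```

Because only the item correlations are needed, the same calculation works when no single person has answered every item, which is the SAPA situation described above.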

Details

The psych package was originally a combination of multiple source files maintained at the http://personality-project.org/r repository: "useful.r", VSS.r, ICLUST.r, omega.r, etc. "useful.r" is a set of routines for easy data entry (read.clipboard), simple descriptive statistics (describe), and SPLOM plots combined with correlations (pairs.panels, adapted from the help files of pairs). It is now a single package.

The VSS routines allow for testing the number of factors (VSS), showing plots (VSS.plot) of goodness of fit, and basic routines for estimating the number of factors/components to extract by examining the scree plot (VSS.scree) or comparing with the scree of an equivalent matrix of random numbers (VSS.parallel).
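The comparison with the scree of a random matrix can be sketched in base R as follows. This is an illustration of the general parallel analysis idea under simple assumptions (normal data, 50 random replications, simulated two-factor items), not the implementation used by VSS.parallel or fa.parallel; all variable names are hypothetical.

```r
set.seed(42)                       # arbitrary seed so the sketch is reproducible
n <- 500                           # number of simulated subjects
p <- 6                             # number of items

# Simulate six items: items 1-3 load 0.8 on one factor, items 4-6 on another
f <- matrix(rnorm(n * 2), n, 2)
lambda <- rbind(cbind(rep(0.8, 3), 0), cbind(0, rep(0.8, 3)))
x <- f %*% t(lambda) + 0.6 * matrix(rnorm(n * p), n, p)

# Eigenvalues of the observed correlation matrix (the scree)
obs_eigen <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values

# Mean eigenvalues of random normal data with the same dimensions
n_iter <- 50
rand_eigen <- rowMeans(replicate(n_iter, {
  eigen(cor(matrix(rnorm(n * p), n, p)),
        symmetric = TRUE, only.values = TRUE)$values
}))

# Number of observed eigenvalues that exceed their random counterparts
n_suggested <- sum(obs_eigen > rand_eigen)
print(round(rbind(observed = obs_eigen, random = rand_eigen), 2))
print(n_suggested)
```

Plotting obs_eigen and rand_eigen against their index gives the usual picture: components are retained up to the point where the observed scree drops below the random one.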

In addition, there are routines for hierarchical factor analysis using Schmid Leiman transformations (omega, omega.graph) as well as Item Cluster Analysis (ICLUST, ICLUST.graph).
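The arithmetic behind a Schmid Leiman transformation can be sketched in a few lines of base R. The loadings below are made up, the function name schmid_leiman is hypothetical, and the sketch starts from an already estimated hierarchical solution; the schmid and omega functions also estimate that solution from data, which is not shown here.

```r
# Hypothetical hierarchical solution: six items, three group factors,
# one general factor. All numbers are made up for illustration.
F1 <- matrix(0, 6, 3)              # first-order (item on group factor) loadings
F1[1:2, 1] <- c(0.8, 0.7)
F1[3:4, 2] <- c(0.7, 0.6)
F1[5:6, 3] <- c(0.6, 0.5)
g2 <- c(0.9, 0.8, 0.7)             # loadings of the group factors on g

schmid_leiman <- function(F1, g2) {
  g_load   <- F1 %*% g2                      # item loadings on the general factor
  grp_load <- F1 %*% diag(sqrt(1 - g2^2))    # residualized group factor loadings
  cbind(g = as.vector(g_load), grp_load)
}

sl <- schmid_leiman(F1, g2)

# Model-implied correlation matrix and the general factor saturation
R_model <- sl %*% t(sl)
diag(R_model) <- 1
omega_h <- sum(sl[, "g"])^2 / sum(R_model)

print(round(sl, 2))
print(round(omega_h, 2))
```

The orthogonalized (bifactor) loadings reproduce the same item correlations as the original hierarchical model, which is why the two structures can be graphed side by side.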

The more important functions in the package are for the analysis of multivariate data, with an emphasis upon those functions useful in scale construction of item composites.

When given a set of items from a personality inventory, one goal is to combine these into higher level item composites. This leads to several questions:

1) What are the basic properties of the data? describe reports basic summary statistics (mean, sd, median, mad, range, minimum, maximum, skew, kurtosis, standard error) for vectors, columns of matrices, or data.frames. describe.by provides descriptive statistics, organized by a grouping variable. pairs.panels shows scatter plot matrices (SPLOMs) as well as histograms and the Pearson correlation for scales or items. error.bars will plot variable means with associated confidence intervals.

2) What is the most appropriate number of item composites to form? After finding either standard Pearson correlations, or finding tetrachoric or polychoric correlations using a wrapper (poly.mat) for John Fox's hetcor function, the dimensionality of the correlation matrix may be examined. The number of factors/components problem is a standard question of factor analysis, cluster analysis, or principal components analysis. Unfortunately, there is no agreed upon answer. The Very Simple Structure (VSS) set of procedures has been proposed as one answer to the question of the optimal number of factors. Other procedures (VSS.scree, VSS.parallel, and fa.parallel) also address this question.

3) What are the best composites to form? Although this may be answered using principal components (principal) or factor analysis (factor.pa), with the results shown graphically (fa.graph), it is sometimes more useful to address this question using cluster analytic techniques. (Better yet is to use maximum likelihood factor analysis using factanal from the stats package.) Previous versions of ICLUST (e.g., Revelle, 1979) have been shown to be particularly successful at doing this. Graphical output from ICLUST.graph uses the Graphviz dot language and allows one to write files suitable for Graphviz.

Graphical organizations of cluster and factor analysis output can be done using cluster.plot which plots items by cluster/factor loadings and assigns items to that dimension with the highest loading.

4) How well does a particular item composite reflect a single construct? This is a question of reliability and general factor saturation. Multiple solutions to this problem have been proposed: (Cronbach's) alpha (score.items), (Revelle's) beta (ICLUST), and (McDonald's) omega (omega). Functions to estimate all three of these are included in psych.

5) For some applications, data matrices are synthetically combined from sampling different items for different people. So called Synthetic Aperture Personality Assessment (SAPA) techniques allow the formation of large correlation or covariance matrices even though no one person has taken all of the items. To analyze such data sets, it is easy to form item composites based upon the covariance matrix of the items, rather than the original data set. These matrices may then be analyzed using a number of functions (e.g., cluster.cor, factor.pa, ICLUST, principal, mat.regress, and factor2cluster).

6) More typically, one has a raw data set to analyze. score.items will score data sets on multiple scales, reporting the scale scores, item-scale and scale-scale correlations, as well as coefficient alpha and alpha-1. Using a "keys" matrix, scales can have overlapping or independent items. score.multiple.choice scores multiple choice items or converts multiple choice items to dichotomous (0/1) format for other functions.
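Coefficient alpha, mentioned in points 4 and 6 above, depends only on the item covariance matrix and the scoring key. The following base R sketch shows the standard formula; alpha_from_cov is a hypothetical helper, not the code of score.items, and the covariance values are made up. A key entry of -1 marks a reverse-keyed item and 0 marks an item that is not part of the scale.

```r
# Hypothetical helper (not psych's score.items): coefficient alpha for each
# scale defined by a keys matrix, computed from the item covariance matrix.
alpha_from_cov <- function(C, keys) {
  apply(keys, 2, function(k) {
    items <- k != 0
    # Reverse-keyed items (-1) flip the sign of their covariances
    Ck <- C[items, items, drop = FALSE] * outer(k[items], k[items])
    n_items <- sum(items)
    (n_items / (n_items - 1)) * (1 - sum(diag(Ck)) / sum(Ck))
  })
}

# Three items with unit variance; item 3 is negatively worded, so its
# covariances with items 1 and 2 are -0.5 rather than 0.5
C <- matrix(0.5, 3, 3)
diag(C) <- 1
C[3, 1:2] <- C[1:2, 3] <- -0.5

keys <- cbind(scale1 = c(1, 1, -1))
alpha <- alpha_from_cov(C, keys)
print(alpha)
```

With several columns in keys, the same call returns one alpha per scale, and items may appear in more than one column when scales overlap.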

An additional set of functions generates simulated data to meet certain structural properties. item.sim creates simple structure data, circ.sim will produce circumplex structured data, and item.dichot produces circumplex or simple structured data for dichotomous items. These item structures are useful for understanding the effects of skew and differential item endorsement on factor and cluster analytic solutions.
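To make the idea of circumplex structured data concrete, here is a minimal base R sketch. It is not the code of circ.sim or item.dichot; the loading size of 0.6, the cut point of 0.5, and all variable names are assumptions. Items are placed at evenly spaced angles in a two factor space, so that neighbouring items correlate positively and opposite items correlate negatively.

```r
set.seed(1)                          # arbitrary seed so the sketch is reproducible
n <- 1000                            # number of simulated subjects
n_items <- 8

# Item angles evenly spaced around the circle, in radians
theta <- 2 * pi * (0:(n_items - 1)) / n_items

# Loadings on two orthogonal factors; every item has communality 0.36
lambda <- 0.6 * cbind(cos(theta), sin(theta))

f <- matrix(rnorm(n * 2), n, 2)                      # factor scores
e <- sqrt(1 - 0.36) * matrix(rnorm(n * n_items), n, n_items)  # unique parts
items <- f %*% t(lambda) + e                         # continuous item scores

r <- cor(items)
print(round(r[1, ], 2))              # item 1 with items at 0, 45, ..., 315 degrees

# Dichotomize at a cut point above the mean to mimic skewed endorsement
dichot <- (items > 0.5) * 1
print(round(colMeans(dichot), 2))    # endorsement rates
```

Comparing cor(items) with cor(dichot) shows how dichotomizing at an extreme cut point attenuates the correlations, which is the kind of effect these simulations are meant to explore.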

When examining personality items, some people like to discuss them as representing items in a two dimensional space with a circumplex structure. Tests of circumplex fit (circ.tests) have been developed. When representing items in a circumplex, it is convenient to view them in polar coordinates.
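The conversion to polar coordinates is simple trigonometry and can be sketched in base R. The helper to_polar and the loadings are hypothetical, and the output layout (angles in degrees from 0 to 360) is an assumption that may differ from what the polar function in psych returns.

```r
# Hypothetical helper: convert two dimensional loadings to polar coordinates
to_polar <- function(loadings) {
  x <- loadings[, 1]
  y <- loadings[, 2]
  data.frame(
    angle  = (atan2(y, x) * 180 / pi) %% 360,  # angular location in degrees
    length = sqrt(x^2 + y^2)                   # vector length (sqrt of communality)
  )
}

# Made-up loadings of three items on two factors
loadings <- rbind(c(0.6, 0.0),
                  c(0.0, 0.5),
                  c(-0.3, -0.4))

polar_coords <- to_polar(loadings)
print(round(polar_coords, 2))
```

In a circumplex, the angles should be spread roughly evenly around the circle while the vector lengths stay approximately equal.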

Five data sets are included: bfi represents 25 personality items thought to represent five factors of personality; iqitems has 14 multiple choice IQ items; sat.act has data on self reported test scores by age and gender; galton is Galton's data set of the heights of parents and their children; and peas recreates the original Galton data set of the genetics of sweet peas.

Package:  psych
Type:     Package
Version:  1.0-42
Date:     2008-3-21
License:  GPL version 2 or newer

Index:

psych                   A package for personality, psychometric, and psychological research.

Useful data entry and descriptive statistics

describe                Basic descriptive statistics useful for psychometrics
describe.by             Find summary statistics by groups
headtail                Combines the head and tail functions for showing data sets
read.clipboard          Shortcut for reading from the clipboard
read.clipboard.csv      Shortcut for reading comma delimited files from the clipboard
pairs.panels            SPLOM and correlations for a data matrix
multi.hist              Histograms and densities of multiple variables arranged in matrix form
skew                    Calculate skew for a vector, each column of a matrix, or data.frame
kurtosi                 Calculate kurtosis for a vector, each column of a matrix, or data.frame
geometric.mean          Find the geometric mean of a vector or columns of a data.frame
harmonic.mean           Find the harmonic mean of a vector or columns of a data.frame
error.bars              Plot means and error bars
error.bars.by           Plot means and error bars for separate groups
error.crosses           Two way error bars
interp.median           Find the interpolated median, quartiles, or general quantiles
table2df                Convert a two dimensional table of counts to a matrix or data frame

Data reduction through cluster and factor analysis

factor.pa               Do a principal axis factor analysis
fa.graph                Show the results of a factor analysis or principal components analysis graphically
principal               Do an eigen value decomposition to find the principal components of a matrix
fa.parallel             Scree test and parallel analysis
ICLUST                  Apply the ICLUST algorithm
ICLUST.graph            Graph the output from ICLUST using the dot language
ICLUST.rgraph           Graph the output from ICLUST using rgraphviz
poly.mat                Find the polychoric correlations for items (uses J. Fox's hetcor)
omega                   Calculate the omega estimate of factor saturation (requires the GPArotation package)
omega.graph             Draw a hierarchical or Schmid Leiman orthogonalized solution
schmid                  Apply the Schmid Leiman transformation to a correlation matrix
score.items             Combine items into multiple scales and find alpha
score.multiple.choice   Combine items into multiple scales and find alpha and basic scale statistics
VSS                     Apply the Very Simple Structure criterion to determine the appropriate number of factors
VSS.parallel            Do a parallel analysis to determine the number of factors for a random matrix
VSS.plot                Plot VSS output
VSS.scree               Show the scree plot of the factor/principal components
VSS.simulate            Generate simulated data for the factor model
make.hierarchical       Generate simulated correlation matrices with hierarchical structure

Procedures particularly useful for Synthetic Aperture Personality Assessment

alpha.scale             Find coefficient alpha for a scale (see also score.items)
correct.cor             Correct a correlation matrix for unreliability
count.pairwise          Count the number of complete cases when doing pairwise correlations
cluster.cor             Find correlations of composite variables from a larger matrix
cluster.loadings        Find correlations of items with composite variables from a larger matrix
eigen.loadings          Find the loadings when doing an eigen value decomposition
factor.pa               Do a principal axis factor analysis and estimate factor scores
factor2cluster          Extract cluster definitions from factor loadings
factor.congruence       Factor congruence coefficient
factor.fit              How well does a factor model fit a correlation matrix
factor.model            Reproduce a correlation matrix based upon the factor model
factor.residuals        Fit = data - model
factor.rotate           "Hand rotate" factors
mat.regress             Multiple regression from matrix input
principal               Do an eigen value decomposition to find the principal components of a matrix

Functions for generating simulated data sets

circ.sim                Generate a two dimensional circumplex item structure
item.sim                Generate a two dimensional simple structure with particular item characteristics
congeneric.sim          Generate a one factor congeneric reliability structure
phi.demo                Create artificial data matrices for teaching purposes

Miscellaneous functions

fisherz                 Apply the Fisher r to z transform
fisherz2r               Apply the Fisher z to r transform
paired.r                Test for the difference of two paired or two independent correlations
r.con                   Confidence intervals for correlation coefficients
r.test                  Test of significance of r, differences between rs
phi                     Find the phi coefficient of correlation from a 2 x 2 table
phi.demo                Demonstrate the problem of phi coefficients with varying cut points
phi2poly                Given a phi coefficient, what is the polychoric correlation
phi2poly.matrix         Given a phi coefficient, what is the polychoric correlation
polar                   Convert 2 dimensional factor loadings to polar coordinates
poly.mat                Use John Fox's hetcor to create a matrix of correlations from a data.frame or matrix of integer values
polychor.matrix         Use John Fox's polycor to create a matrix of polychoric correlations from a matrix of Yule correlations
Yule                    Find the Yule Q coefficient of correlation
Yule.inv                What is the two by two table that produces a Yule Q with set marginals?
Yule2phi                What is the phi coefficient corresponding to a Yule Q with set marginals?
Yule2phi.matrix         Convert a matrix of Yule coefficients to a matrix of phi coefficients
Yule2poly.matrix        Convert a matrix of Yule coefficients to a matrix of polychoric coefficients

Functions that are under development and not recommended for casual use

irt.item.diff.rasch     IRT estimate of item difficulty with the assumption that theta = 0
irt.person.rasch        Item Response Theory estimates of theta (ability) using a Rasch like model

Data sets included in the psych package

bfi                     Represents 25 personality items thought to represent five factors of personality
iqitems                 14 multiple choice IQ items
sat.act                 Self reported ACT and SAT Verbal and Quantitative scores by age and gender
galton                  Galton's data set of the heights of parents and their children
heights                 Galton's data set of the relationship between height and forearm (cubit) length
cubits                  Galton's data table of height and forearm length
peas                    Galton's data set of the diameters of 700 parent and offspring sweet peas

test.psych              Run a test of the major functions on 5 different data sets. Primarily for development purposes, although the output can be used as a demo of the various functions.

References

A general guide to personality theory and research may be found at the personality-project http://personality-project.org. See also the short guide to R at http://personality-project.org/r. In addition, see An Introduction to Psychometric Theory with applications in R (Revelle, in preparation) at http://personality-project.org/r/book/

Examples

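# See the separate man pages for examples of the individual functions
library(psych)   # load the package before calling its functions
test.psych()     # run a test of the major functions on the included data sets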
