qgraph: qgraph

Description

This is the main function of qgraph. It is typically used to visualize the relationship between several variables as a network. This function can be used to create various types of networks based on a number of input options. Please see the details section for more detail.

Usage

qgraph( input, ... )

Arguments

input

Can be either a weights matrix or an edgelist. Can also be an object of class "sem" (sem), "mod" (sem), "lavaan" (lavaan), "principal" (psych), "loadings" (stats), "factanal" (stats), "graphNEL" (Rgraphviz) or "pcAlgo" (pcalg)

...

Any additional arguments described below. Also a list with class "qgraph" can be added that contains any of these arguments (this is returned invisibly by the function)

Value

qgraph returns (invisibly) a 'qgraph' object that contains all the arguments used, with the exeption of the 'layout' argument which is set to the final layout used in the graph. This can then be sent to a new qgraph function to use the same arguments in the new plot.
One of the values returned is 'layout.orig', the original (not rescaled) layout, which needs to be used when using constraint layouts.

encoding

UTF8

Using qgraph to plot graphs

The first argument of qgraph(), 'input', is the input. This can be a number of objects but is mainly either a weights matrix or an edgelist. Here we will assume a graph is made of n nodes connected by m edges. qgraph is mainly aimed at visualizing (statistical) relationships between variables as weighted edges. In these edge weights a zero indicates no connection and negative values are comparable in strength to positive values. Many (standardized) statistics follow these rules, the most important example being correlations. In the special case where all edge weights are either 0 or 1 the weights matrix is interpreted as an adjacency matrix and an unweighted graph is made.

a weights matrix is a square n by n matrix in which each row and column represents a node. The element at row i and column j indicates the connection from node i to node j. If the weights matrix is symmetrical an undirected graph is made and if the matrix is asymmetrical a directed graph is made (note that due to floating point errors seemingly symmetrical matrices may actually be asymmetrical). These matrices occur naturally in statistics or can easily be obtained through matrix algebra, the most important example being correlation matrices.

An edgelist can also be used. This is a m by 2 matrix (not a list!) in which each row indicates an edge. The first column indicates the number of the start of the edge and the second column indicates the number of the end of the edge. The number of each node is a unique integer between 1 and n. The total number of nodes will be estimated by taking the highest value of the edgelist. If this is incorrect (there are nodes with no edges beyond the ones already specified) the 'nNodes' argument can be used. If an integer between 1 and n is missing in the edgelist it is assumed to be a node with no edges. To create a weighted graph edge weights can be added as a third column in the edgelist. By default using an edgelist creates a directed graph, but this can be set with the 'directed' argument.

Interpreting graphs

In weighted graphs green edges indicate positive weights and red edges indicate negative weights. The color saturation and the width of the edges corresponds to the absolute weight and scale relative to the strongest weight in the graph. It is possible to set this strongest edge by using the 'maximum' argument. When 'maximum' is set to a value above any absolute weight in the graph that value is considered the strongest edge (this must be done to compare different graphs; a good value for correlations is 1). Edges with an absolute value under the 'minimum' argument are omitted (useful to keep filesizes from inflating in very large graphs).

In larger graphs the above edge settings can become hard to interpret. With the 'cut' argument a cutoff value can be set which splits scaling of color and width. This makes the graphs much easier to interpret as you can see important edges and general trends in the same picture. Edges with absolute weights under the cutoff score will have the smallest width and become more colorful as they approach the cutoff score, and edges with absolute weights over the cutoff score will be full red or green and become wider the stronger they are.

Specifying the layout

The placement of the nodes (i.e. the layout) is specified with the 'layout' argument. It can be manually specified by entering a matrix for this argument. The matrix must have a row for each node and two columns indicating its X and Y coordinate respectively. qgraph plots the nodes on a (-1:1)(-1:1) plane, and the given coordinates will be rescaled to fit this plane unless 'rescale' is FALSE (not recommended). Another option to manually specify the layout is by entering a matrix with more then two columns. This matrix must then consist of zeroes and a number (the order in the weights matrix) for each node indicating it's place. For example:

0 0 2 0 0

1 0 3 0 4

will place node 2 at the top in the center, node 1 at the bottom left corner, node 3 at the bottom in the center and node 4 at the bottom right corner. It is recommended however that one of the integrated layouts is used. 'layout' can be given a character as argument to accomplish that. layout="circular" will simply place all nodes in a circle if the groups argument is not used and in separate circles per group if the groups argument is used (see next section).

The circular layout is convenient to see how well the data conforms to a model, but to show how the data clusters another layout is more appropriate. By specifying layout="spring" the Fruchterman-reingold algorithm (Fruchterman & Reingold, 1991), which has been ported from the SNA package (Butts, 2010), can be used to create a force-directed layout. In principle, what this function does is that each node (connected and unconnected) repulse each other, and connected nodes also attract each other. Then after a number of iterations (500 by default) in which the maximum displacement of each node becomes smaller a layout is achieved in which the distance between nodes correspond very well to the absolute edge weight between those nodes.

A solution to use this function for weighted graphs has been taken from the igraph package (Csardi G & Nepusz T, 2006) in which the same function was ported from the SNA package. New in qgraph are the option to include constraints on the nodes by fixing a coordinate for nodes or reducing the maximum allowed displacement per node. This can be done with the 'layout.par' argument. For more information see qgraph.layout.fruchtermanreingold.

By default, 'layout' is set to "spring" for unweighted and directed graphs and "circular" otherwise.

Specifying a measurement model

A measurement model can be specified with the 'groups' argument. This must be a list in which each element is a vector containing the numbers of nodes that belong together (numbers are taken from the order in the weights matrix). All numbers must be included. If a groups list is specified the "groups" layout can be used to place these nodes together, the nodes in each group will be given a color, and a legend can be plotted (by setting 'legend' to TRUE). The colors will be taken from the 'color' argument, or be generated with the rainbow function.

Output

By default qgraph will plot the graph in a new R window. However the graphs are optimized to be plotted in a PDF file. To easily create a pdf file set the 'filetype' argument to "pdf". 'filename' can be used to specify the filename and folder to output in. 'height' and 'width' can be used to specify the height and width of the image in inches. By default a new R window is opened if the current device is the NULL-device, otherwise the current device is used (note that when doing this 'width' and 'height' still optimize the image for those widths and heights, even though the output screen size isn't affected, this is especially important for directed graphs!).

Furthermore filetype can also be set to numerous other values. Alternatively any output device in R can be used by simply opening the device before calling qgraph and closing it with dev.off() after calling qgraph.

The graphs can also be outputted in an SVG file using the RSVGTipsDevice package (Plate, 2009). An SVG image can be opened in most browsers (firefox and chrome are recommended), and can be used to display tooltips. Each node can be given a tooltip with the 'tooltips' argument. The function qgraph.svg can be used to make a battery of svg pictures with hyperlinks to each other, working like a navigation menu.

IMPORTANT NOTE: RSVGTipsDevice is a 32-bit only package, so SVG functionality is not available in 64bit versions of R.

Finally, the filetype 'tex' can be used. This uses the tikzDevice package to create a LaTeX file that can then be compiled in your LaTeX compiler to create a pdf file. The main benefit of this over plotting directly in a pdf file is that tooltips can be added which can be viewed in several PDF document readers (Adobe Reader is recommended for the best result).

Manual specification of color and width

In qgraph the widths and colors of each edge can also be manually controlled. To directly specify the width of each edge set the 'mode'' argument to "direct". This will then use the absolute edge weights as the width of each edge (negative values can still be used to make red edges). To manually set the color of each edge, set the 'edge.color' argument to a matrix with colors for each edge (when using a weights matrix) or a vector with a color for each edge (when using an edgelist).

Additional information

By default, edges will be straight between two nodes unless there are two edges between two nodes. To overwrite this the 'bidirectional' argument can be set to TRUE, which will turn two edges between two nodes into one bidirectional edge. 'bidirectional' can alsobe a vector with TRUE or FALSE for each edge.

To specify the strength of the curve the argument 'curve' can be used (but only in directional graphs). 'curve' must be given a numerical value that represent an offset from the middle of the straight edge through where the curved edge must be drawn. 0 indicates no curve, and any other value indicates a curve of that strength. A value of 0.3 is recommended for nice curves. This can be either one number or a vector with the curve of each edge.

Nodes and edges can be given labels with the 'labels' and the 'edge.labels' arguments. 'labels' can be set to FALSE to omit labels, TRUE (default) to set labels equal to the node number (order in the weights matrix) or it can be a vector with the label for each node. Edge labels can also be set to FALSE to be omitted (default). If 'edge.labels' is TRUE then the weight of each label is printed. Finally, 'edge.labels' can also be a vector with the label for each edge. If a label (both for edges and nodes) contain an asterisk then the asterisk is omitted and that label is printed in the symbol font (useful to print Greek letters).

A final two things to try: the 'scores' argument can be given a vector with the scores of a person on each variable, which will then be shown using colors of the nodes, And the 'bg' argument can be used to change the background of the graph to another color, or use bg=TRUE for a special background (do set transparency=TRUE when using background colors other then white).

Debugging

If this function crashes for any reason with the filetype argument specified, run:

dev.off()

To shut down the output device!

Details

The qgraph function has a lot of arguments. Mostly the default values of these work well. Because of the amount of arguments the usage of the qgraph function has been reduced by using the ... method for clarity. This does mean that arguments need to be specified by using their exact name. For instance, to specify color="red" you can not use col="red".

Below is a complete list of all the arguments that can be used in qgraph and a detailed guide on how the function can be used.

References

Sacha Epskamp, Angelique O. J. Cramer, Lourens J. Waldorp, Verena D. Schmittmann, Denny Borsboom (2012). qgraph: Network Visualizations of Relationships in Psychometric Data. Journal of Statistical Software, 48(4), 1-18. URL http://www.jstatsoft.org/v48/i04/.

Carter T. Butts (2010). sna: Tools for Social Network Analysis. R package version 2.2-0. http://CRAN.R-project.org/package=sna

Csardi G, Nepusz T (2006). The igraph software package for complex network research, InterJournal, Complex Systems 1695. http://igraph.sf.net

Korbinian Strimmer (2009). fdrtool: Estimation and Control of (Local) False Discovery Rates. R package version 1.2.6. http://CRAN.R-project.org/package=fdrtool

Plate, T. and based on RSvgDevice by T Jake Luciani (2009). RSVGTipsDevice: An R SVG graphics device with dynamic tips and hyperlinks. R package version 1.0-1.

Fruchterman, T. & Reingold, E. (1991). Graph drawing by force-directed placement. Software - Pract. Exp. 21, 1129?1164.

Examples

Run this code

### BIG 5 DATASET ###
# Load big5 dataset:
data(big5)
data(big5groups)

# Correlations:
Q <- qgraph(cor(big5),minimum=0.25,cut=0.4,vsize=2,groups=big5groups,legend=TRUE,borders=FALSE)
title("Big 5 correlations",line=2.5)

# Same graph with spring layout:
Q <- qgraph(Q,layout="spring")
title("Big 5 correlations",line=2.5)

# Same graph with Venn diagram overlay:
qgraph(Q,overlay=TRUE)
title("Big 5 correlations",line=2.5)

# Significance graph (circular):
qgraph(Q,graph="sig",layout="circular")
title("Big 5 correlations (p-values)",line=2.5)

# Significance graph:
qgraph(Q,graph="sig")
title("Big 5 correlations (p-values)",line=2.5)

# Significance graph (distinguishing positive and negative statistics):
qgraph(Q,graph="sig2")
title("Big 5 correlations (p-values)",line=2.5)

# Grayscale graphs:
qgraph(Q,gray=TRUE,layout="circular")
title("Big 5 correlations",line=2.5)

qgraph(Q,graph="sig",gray=TRUE)
title("Big 5 correlations (p-values)",line=2.5)

# Correlations graph with scores of random subject:
qgraph(cor(big5),minimum=0.25,cut=0.4,vsize=2,groups=big5groups,legend=TRUE,borders=FALSE,scores=as.integer(big5[sample(1:500,1),]),scores.range=c(1,5))
title("Test scores of random subject",line=2.5)

# EFA:
big5efa <- factanal(big5,factors=5,rotation="promax",scores="regression")
qgraph(big5efa,groups=big5groups,layout="circle",rotation="promax",minimum=0.2,cut=0.4,vsize=c(1,15),borders=FALSE,asize=0.07,esize=4,vTrans=200,filetype="R",width=7,height=7)
title("Big 5 EFA",line=2.5)

# PCA:
library("psych")
big5pca <- principal(cor(big5),5,rotate="promax")
qgraph(big5pca,groups=big5groups,layout="circle",rotation="promax",minimum=0.2,cut=0.4,vsize=c(1,15),borders=FALSE,asize=0.07,esize=4,vTrans=200)
title("Big 5 PCA",line=2.5)

#### UNWEIGHTED DIRECTED GRAPHS ###
set.seed(1)
adj=matrix(sample(0:1,10^2,TRUE,prob=c(0.8,0.2)),nrow=10,ncol=10)
qgraph(adj)
title("Unweighted and directed graphs",line=2.5)

# Save plot to nonsquare pdf file:
qgraph(adj,filetype='pdf',height=5,width=10)

#### EXAMPLES FOR EDGES UNDER DIFFERENT ARGUMENTS ###
# Create edgelist:
dat.3 <- matrix(c(1:15*2-1,1:15*2),,2)
dat.3 <- cbind(dat.3,round(seq(-0.7,0.7,length=15),1))

# Create grid layout:
L.3 <- matrix(1:30,nrow=2)

# Different esize:
qgraph(dat.3,layout=L.3,directed=FALSE,edge.labels=TRUE,esize=14)

# Different esize, strongest edges omitted (note how 0.4 edge is now 
# just as wide as 0.7 edge in previous graph):
qgraph(dat.3[-c(1:3,13:15),],layout=L.3,nNodes=30,directed=FALSE,
	edge.labels=TRUE,esize=14)

# Different esize, with maximum:
qgraph(dat.3,layout=L.3,directed=FALSE,edge.labels=TRUE,esize=14,maximum=1)
title("maximum=1",line=2.5)

qgraph(dat.3[-c(1:3,13:15),],layout=L.3,nNodes=30,directed=FALSE,edge.labels=TRUE,
	esize=14,maximum=1)
title("maximum=1",line=2.5)

# Different minimum
qgraph(dat.3,layout=L.3,directed=FALSE,edge.labels=TRUE,esize=14,minimum=0.1)
title("minimum=0.1",line=2.5)

# With cutoff score:
qgraph(dat.3,layout=L.3,directed=FALSE,edge.labels=TRUE,esize=14,cut=0.4)
title("cut=0.4",line=2.5)

# With details:
qgraph(dat.3,layout=L.3,directed=FALSE,edge.labels=TRUE,esize=14,minimum=0.1,maximum=1,
	cut=0.4,details=TRUE)
title("details=TRUE",line=2.5)

# Trivial example of manually specifying edge color and widths:
E <- as.matrix(data.frame(from=rep(1:3,each=3),to=rep(1:3,3),width=1:9))
qgraph(E,mode="direct",edge.color=rainbow(9))

### STRUCTURAL EQUATION MODELLING ###
library('sem')

### This example is taken from the examples of the sem function. 
### Only names were changed to better suit the path diagram.

# ----------------------- Thurstone data ---------------------------------------
#  Second-order confirmatory factor analysis, from the SAS manual for PROC CALIS

R.thur <- readMoments(diag=FALSE, names=c('Sen','Voc',
        'SC','FL','4LW','Suf',
        'LS','Ped', 'LG'))
    .828                                              
    .776   .779                                        
    .439   .493    .46                                 
    .432   .464    .425   .674                           
    .447   .489    .443   .59    .541                    
    .447   .432    .401   .381    .402   .288              
    .541   .537    .534   .35    .367   .32   .555        
    .38   .358    .359   .424    .446   .325   .598   .452  
            
model.thur <- specifyModel()
    F1 -> Sen,               *l11, NA
    F1 -> Voc,               *l21, NA
    F1 -> SC,                *l31, NA
    F2 -> FL,                *l41, NA
    F2 -> 4LW,               *l52, NA
    F2 -> Suf,               *l62, NA
    F3 -> LS,                *l73, NA
    F3 -> Ped,               *l83, NA
    F3 -> LG,                *l93, NA
    F4 -> F1,                *g1,  NA
    F4 -> F2,                *g2,  NA
    F4 -> F3,                *g3,  NA 
    Sen <-> Sen,             q*1,   NA
    Voc<-> Voc,              q*2,   NA
    SC <-> SC,               q*3,   NA
    FL <-> FL,               q*4,   NA
    4LW <-> 4LW,             q*5,   NA
    Suf<-> Suf,              q*6,   NA
    LS <-> LS,               q*7,   NA
    Ped<-> Ped,              q*8,   NA
    LG <-> LG,               q*9,   NA
    F1 <-> F1,               NA,     1
    F2 <-> F2,               NA,     1
    F3 <-> F3,               NA,     1
    F4 <-> F4,               NA,     1

require('sem')

### This example is taken from the examples of the sem function. 
### Only names were changed to better suit the path diagram.

# ----------------------- Thurstone data ---------------------------------------
#  Second-order confirmatory factor analysis, from the SAS manual for PROC CALIS

R.thur <- structure(c(1, 0.828, 0.776, 0.439, 0.432, 0.447, 0.447, 0.541, 
0.38, 0, 1, 0.779, 0.493, 0.464, 0.489, 0.432, 0.537, 0.358, 
0, 0, 1, 0.46, 0.425, 0.443, 0.401, 0.534, 0.359, 0, 0, 0, 1, 
0.674, 0.59, 0.381, 0.35, 0.424, 0, 0, 0, 0, 1, 0.541, 0.402, 
0.367, 0.446, 0, 0, 0, 0, 0, 1, 0.288, 0.32, 0.325, 0, 0, 0, 
0, 0, 0, 1, 0.555, 0.598, 0, 0, 0, 0, 0, 0, 0, 1, 0.452, 0, 0, 
0, 0, 0, 0, 0, 0, 1), .Dim = c(9L, 9L), .Dimnames = list(c("Sen", 
"Voc", "SC", "FL", "4LW", "Suf", "LS", "Ped", "LG"), c("Sen", 
"Voc", "SC", "FL", "4LW", "Suf", "LS", "Ped", "LG")))
            
model.thur <- structure(c("F1 -> Sen", "F1 -> Voc", "F1 -> SC", "F2 -> FL", 
"F2 -> 4LW", "F2 -> Suf", "F3 -> LS", "F3 -> Ped", "F3 -> LG", 
"F4 -> F1", "F4 -> F2", "F4 -> F3", "Sen <-> Sen", "Voc<-> Voc", 
"SC <-> SC", "FL <-> FL", "4LW <-> 4LW", "Suf<-> Suf", "LS <-> LS", 
"Ped<-> Ped", "LG <-> LG", "F1 <-> F1", "F2 <-> F2", "F3 <-> F3", 
"F4 <-> F4", "*l11", "*l21", "*l31", "*l41", "*l52", "*l62", 
"*l73", "*l83", "*l93", "*g1", "*g2", "*g3", "q*1", "q*2", "q*3", 
"q*4", "q*5", "q*6", "q*7", "q*8", "q*9", NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, "1", "1", "1", "1"), .Dim = c(25L, 3L), class = "semmod")

# Path diagram of the model:
qgraph(model.thur)
qgraph(model.thur,layout="tree",manifest=c('Sen','Voc','SC','FL','4LW','Suf','LS','Ped','LG'))

# Estimate and plot parameters:
sem.thur <- sem(model.thur, R.thur, 213)
qgraph(sem.thur,layout="tree",curve=0.4)

### pcalg support ###
# Example from pcalg vignette:
library("pcalg")
data(gmI)
suffStat <- list(C = cor(gmI$x), n = nrow(gmI$x))
pc.fit <- pc(suffStat, indepTest=gaussCItest,
p = ncol(gmI$x), alpha = 0.01)

qgraph(pc.fit)

Run the code above in your browser using DataLab