netlm: Linear Regression for Network Data

Description

netlm regresses the network variable in y on the network variables in stack x using ordinary least squares. The resulting fits (and coefficients) are then tested against the indicated null hypothesis.

Usage

netlm(y, x, intercept=TRUE, mode="digraph", diag=FALSE,
    nullhyp=c("qap", "qapspp", "qapy", "qapx", "qapallx", 
    "cugtie", "cugden", "cuguman", "classical"), 
    test.statistic = c("t-value", "beta"), tol=1e-7,
    reps=1000)

Arguments

y

dependent network variable. This should be a matrix, for obvious reasons; NAs are allowed, but dichotomous data is strongly discouraged due to the assumptions of the analysis.

x

stack of independent network variables. Note that NAs are permitted, as is dichotomous data.

intercept

logical; should an intercept term be added?

mode

string indicating the type of graph being evaluated. "digraph" indicates that edges should be interpreted as directed; "graph" indicates that edges are undirected. mode is set to "digraph" by default.

diag

logical; should the diagonal be treated as valid data? Set this true if and only if the data can contain loops. diag is FALSE by default.

nullhyp

string indicating the particular null hypothesis against which to test the observed estimands.

test.statistic

string indicating the test statistic to be used for the Monte Carlo procedures.

tol

tolerance parameter for qr.solve.

reps

integer indicating the number of draws to use for quantile estimation. (Relevant to the null hypothesis test only - the analysis itself is unaffected by this parameter.) Note that, as for all Monte Carlo procedures, convergence is slower for more extreme quantiles. By default, reps=1000.
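The remark about extreme quantiles can be made concrete with a small calculation. The sketch below assumes independent Monte Carlo draws, so that an estimated tail probability has the usual binomial standard error; the helper name mc_se and the example probabilities are illustrative and not part of netlm.

```r
# Binomial standard error of a Monte Carlo estimate of a tail probability p,
# assuming `reps` independent draws (mc_se is a hypothetical helper).
mc_se <- function(p, reps) sqrt(p * (1 - p) / reps)

# Tail probabilities of decreasing size, all estimated with reps = 1000
p <- c(0.5, 0.05, 0.005)

# Standard error relative to the quantity being estimated
rel_err <- mc_se(p, reps = 1000) / p
rel_err
```

The relative error grows as p shrinks, which is why tests that hinge on small p-values may call for more draws than the default.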

Value

An object of class netlm

Details

netlm performs an OLS linear network regression of the graph y on the graphs in x. Network regression using OLS is directly analogous to standard OLS regression elementwise on the appropriately vectorized adjacency matrices of the networks involved. In particular, the network regression attempts to fit the model:

$$\mathbf{A_y} = b_0 \mathbf{A_1} + b_1 \mathbf{A_{x_1}} + b_2 \mathbf{A_{x_2}} + \dots + \mathbf{Z}$$

where $\mathbf{A_y}$ is the dependent adjacency matrix, $\mathbf{A_{x_i}}$ is the ith independent adjacency matrix, $\mathbf{A_1}$ is an n x n matrix of 1's, and $\mathbf{Z}$ is an n x n matrix of independent normal random variables with mean 0 and variance $\sigma^2$. Clearly, this model is nonoptimal when $\mathbf{A_y}$ is dichotomous (or, for that matter, categorical in general); an alternative such as netlogit should be employed in such cases. (Note that netlm will still attempt to fit such data; the user should consider themselves warned.)
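The correspondence with elementwise OLS can be sketched in base R. This is a minimal illustration under stated assumptions, not the netlm implementation: the graphs are simulated, the names n, x1, x2, and offdiag are hypothetical, and the diagonal is dropped to mirror diag=FALSE.

```r
# Minimal sketch: network regression as OLS on vectorized adjacency matrices.
# All names are illustrative; this is not netlm's internal code.
set.seed(1)
n <- 10

# Two simulated directed, dichotomous predictor graphs
x1 <- matrix(rbinom(n * n, 1, 0.5), n, n)
x2 <- matrix(rbinom(n * n, 1, 0.5), n, n)

# Valued response: known coefficients plus independent normal noise (the Z term)
y <- 1 + 2 * x1 - 3 * x2 + matrix(rnorm(n * n, sd = 0.1), n, n)

# Exclude the diagonal, as with diag=FALSE
offdiag <- row(y) != col(y)

# Vectorize the remaining cells and fit ordinary least squares
fit <- lm(y[offdiag] ~ x1[offdiag] + x2[offdiag])
coef(fit)
```

The fitted coefficients should land close to the simulated values of 1, 2, and -3. The classical standard errors reported by lm are the ones that become suspect when the cells are autocorrelated.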

Because of the frequent presence of row/column/block autocorrelation in network data, classical null hypothesis tests (and associated standard errors) are generally suspect. Further, it is sometimes of interest to compare fitted parameter values to those arising from various baseline models (e.g., uniform random graphs conditional on certain observed statistics). The tests supported by netlm are as follows:

classical: tests based on classical asymptotics.
cug: conditional uniform graph test (see cugtest) controlling for order.
cugden: conditional uniform graph test, controlling for order and density.
cugtie: conditional uniform graph test, controlling for order and tie distribution.
qap: QAP permutation test (see qaptest); currently identical to qapspp.
qapallx: QAP permutation test, using independent x-permutations.
qapspp: QAP permutation test, using Dekker's "semi-partialling plus" procedure.
qapx: QAP permutation test, using (single) x-permutations.
qapy: QAP permutation test, using y-permutations.

The statistic to be employed in the above tests may be selected via test.statistic. By default, the $t$-statistic (rather than the estimated coefficient) is used, as this is more nearly pivotal; coefficient-based tests are not recommended for QAP null hypotheses, although they are provided here for legacy purposes.
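The idea behind a y-permutation QAP test with a t-statistic can be sketched in base R. This is a simplified illustration with a single predictor, not the procedure netlm runs internally; the helper slope_t and all variable names are hypothetical, and the data are simulated.

```r
# Minimal sketch of a QAP y-permutation test using the slope's t-statistic.
# Illustrative only; names are hypothetical and this is not netlm's code.
set.seed(42)
n <- 12
x <- matrix(rnorm(n * n), n, n)
y <- 0.5 * x + matrix(rnorm(n * n), n, n)

# Off-diagonal mask (loops excluded)
offdiag <- row(y) != col(y)

# t-statistic for the slope of y on x over the off-diagonal cells
slope_t <- function(ymat, xmat) {
  fit <- lm(ymat[offdiag] ~ xmat[offdiag])
  summary(fit)$coefficients[2, "t value"]
}

t_obs <- slope_t(y, x)

# Relabel the vertices of y by permuting rows and columns together,
# which preserves the structure of y while breaking its alignment with x
reps <- 200
t_null <- replicate(reps, {
  p <- sample(n)
  slope_t(y[p, p], x)
})

# Two-sided Monte Carlo p-value estimate
p_two_sided <- mean(abs(t_null) >= abs(t_obs))
p_two_sided
```

Because rows and columns are permuted with the same index vector, diagonal cells stay on the diagonal and the same mask applies to every permuted matrix.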

Note that interpretation of quantiles for single coefficients can be complex in the presence of multicollinearity or third variable effects. qapspp is generally recommended for most multivariable analyses, as it is known to be fairly robust to these conditions. Reasonable printing and summarizing of netlm objects is provided by print.netlm and summary.netlm, respectively. No plot methods exist at this time, alas.

References

Dekker, D.; Krackhardt, D.; Snijders, T.A.B. (2007). “Sensitivity of MRQAP Tests to Collinearity and Autocorrelation Conditions.” Psychometrika, 72(4), 563-581.

Dekker, D.; Krackhardt, D.; Snijders, T.A.B. (2003). “Multicollinearity Robust QAP for Multiple Regression.” CASOS Working Paper, Carnegie Mellon University.

Krackhardt, D. (1987). “QAP Partialling as a Test of Spuriousness.” Social Networks, 9, 171-186.

Krackhardt, D. (1988). “Predicting With Networks: Nonparametric Multiple Regression Analyses of Dyadic Data.” Social Networks, 10, 359-382.

Examples

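# Load the sna package, which provides rgraph and netlm
library(sna)

# Create a stack of four random input graphs on 20 vertices
x <- rgraph(20, 4)

# Create a response structure; note that the fourth graph is unrelated
y <- x[1,,] + 4*x[2,,] + 2*x[3,,]

# Fit a netlm model, using 100 draws for the null hypothesis test
nl <- netlm(y, x, reps=100)

# Examine the results
summary(nl)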