SARTML: Estimate SART model

Description

Estimate the SART model, a Tobit model with social interactions for a left-censored outcome, by maximum likelihood.

Usage

SARTML(
  formula,
  contextual,
  Glist,
  theta0 = NULL,
  optimizer = "optim",
  opt.ctr = list(),
  print = TRUE,
  cov = TRUE,
  data
)

Arguments

formula

an object of class formula: a symbolic description of the model. The formula should be of the form y ~ x1 + x2 | x1 + x2, where y is the endogenous vector, the variables listed before the pipe (x1, x2) are the individual exogenous variables, and the variables listed after the pipe (x1, x2) are the contextual observable variables. Other valid formulas include y ~ x1 + x2 for the model without contextual effects, y ~ -1 + x1 + x2 | x1 + x2 for the model without intercept, and y ~ x1 + x2 | x2 + x3 to let the contextual variables differ from the individual variables.

contextual

(optional) logical; if TRUE, all individual variables are also used as contextual variables. Setting the formula to y ~ x1 + x2 with contextual = TRUE is equivalent to setting the formula to y ~ x1 + x2 | x1 + x2.

Glist

the adjacency matrix, or a list of sub-adjacency matrices (one per sub-network).

theta0

(optional) starting value of $\theta = (\lambda, \beta, \gamma, \sigma)$. The parameter $\gamma$ should be removed if the model does not contain contextual effects (see details).

optimizer

either "nlm" (referring to the function nlm) or "optim" (referring to the function optim). Other arguments of these functions, such as the control values and the method, can be set through the argument opt.ctr.

opt.ctr

list of arguments of nlm or optim (the one set in optimizer) such as control, method, ...

print

a boolean indicating if the estimate should be printed at each step.

cov

a boolean indicating if the covariance should be computed.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which SARTML is called.
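To make the layout of theta0 and opt.ctr more concrete, here is a minimal base-R sketch. The numeric values and the object names (lambda0, theta0_ctx, opt_optim, ...) are hypothetical, and placing the intercept first within $\beta$ is an assumption based on the usual model-matrix ordering. The list entries are regular arguments of optim and nlm; how SARTML forwards them is assumed from the description of opt.ctr above.

```r
# Hypothetical starting values, ordered as theta = (lambda, beta, gamma, sigma)
lambda0 <- 0.2
beta0   <- c(0, 0, 0)   # intercept, x1, x2 (assumed order)
gamma0  <- c(0, 0)      # contextual effects of x1, x2
sigma0  <- 1

# With contextual effects, e.g. y ~ x1 + x2 | x1 + x2
theta0_ctx   <- c(lambda0, beta0, gamma0, sigma0)

# Without contextual effects, e.g. y ~ x1 + x2: gamma is dropped
theta0_noctx <- c(lambda0, beta0, sigma0)

# Extra optimizer arguments; the relevant list depends on `optimizer`
opt_optim <- list(method = "BFGS",
                  control = list(maxit = 5e3, reltol = 1e-11))  # for "optim"
opt_nlm   <- list(gradtol = 1e-8, iterlim = 500)                # for "nlm"
```

With two regressors and an intercept, theta0_ctx has seven elements and theta0_noctx has five.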

Value

A list consisting of:

number of sub-networks.

number of individuals in each network.

estimate

Maximum Likelihood (ML) estimator.

likelihood

likelihood value.

cov

covariance matrix of the estimate.

optimization

output as returned by the optimizer.

codedata

list of formula, name of the object Glist, number of friends in the network, name of the object data, and number of zeros in y (see details).

Details

Model

The left-censored variable $\mathbf{y}$ is generated from a latent variable $\mathbf{y}^*$. The latent variable is given for all $i$ as $$y_i^* = \lambda \mathbf{g}_i \mathbf{y} + \mathbf{x}_i'\beta + \mathbf{g}_i\mathbf{X}\gamma + \epsilon_i,$$ where $\epsilon_i \sim N(0, \sigma^2)$. The censored variable $y_i$ is then defined as $y_i = 0$ if $y_i^* \leq 0$ and $y_i = y_i^*$ otherwise.
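As a rough illustration of the model, the following base-R sketch simulates $\mathbf{y}$ for a single small network by iterating the mapping $\mathbf{y} \mapsto \max(0, \lambda\mathbf{G}\mathbf{y} + \mathbf{X}\beta + \mathbf{G}\mathbf{X}\gamma + \epsilon)$ until it stabilizes. The iteration scheme, the tolerance, the network size and all object names are assumptions made for illustration; this is not necessarily how simTobitnet or SARTML proceed internally.

```r
set.seed(123)

n      <- 50                 # individuals in one (hypothetical) network
lambda <- 0.4                # peer effect
beta   <- c(2, -1.9, 0.8)    # intercept and coefficients of x1, x2
gamma  <- c(1.5, -1.2)       # contextual coefficients of x1, x2
sigma  <- 1.5                # standard deviation of epsilon

# Exogenous variables
X <- cbind(x1 = rnorm(n, 1, 1), x2 = rexp(n, 0.4))

# Row-normalized adjacency matrix: each individual has up to 5 friends
G <- matrix(0, n, n)
for (i in 1:n) {
  friends       <- sample((1:n)[-i], sample(0:5, 1))
  G[i, friends] <- 1
}
rs <- rowSums(G); rs[rs == 0] <- 1
G  <- G / rs

# Part of the latent variable that does not depend on y
eps   <- rnorm(n, 0, sigma)
fixed <- drop(cbind(1, X) %*% beta + G %*% X %*% gamma + eps)

# Fixed-point iteration on y = max(0, lambda * G y + fixed)
y <- pmax(fixed, 0)
for (it in 1:1000) {
  ynew <- pmax(lambda * drop(G %*% y) + fixed, 0)
  done <- max(abs(ynew - y)) < 1e-10
  y    <- ynew
  if (done) break
}

# Latent variable evaluated at the (approximate) fixed point
ystar <- lambda * drop(G %*% y) + fixed
```

Because every row of $\mathbf{G}$ sums to at most one and $|\lambda| < 1$, the mapping is a contraction in the sup norm, so the loop converges to a unique solution regardless of the starting point.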

`codedata`

The class of the output of this function is SARTML. This class has summary and print methods to summarize and print the results. The adjacency matrix is needed to summarize the results. However, to save memory, the function does not return it. Instead, it returns codedata, which contains the formula and the name of the adjacency matrix passed through the argument Glist. codedata is used to retrieve the adjacency matrix, so the adjacency matrix must remain available in .GlobalEnv under that name. Otherwise, the adjacency matrix must be provided to the summary and print functions.

Examples

# NOT RUN {
# Groups' size
M      <- 5 # Number of sub-groups
nvec   <- round(runif(M, 100, 1000))
n      <- sum(nvec)

# Parameters
lambda <- 0.4
beta   <- c(2, -1.9, 0.8)
gamma  <- c(1.5, -1.2)
sigma  <- 1.5
theta  <- c(lambda, beta, gamma, sigma)

# X
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Network
Glist  <- list()

for (m in 1:M) {
  nm           <- nvec[m]
  Gm           <- matrix(0, nm, nm)
  max_d        <- 30
  for (i in 1:nm) {
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1))
    Gm[i, tmp] <- 1
  }
  rs           <- rowSums(Gm); rs[rs == 0] <- 1
  Gm           <- Gm/rs
  Glist[[m]]   <- Gm
}


# data
data    <- data.frame(x1 = X[,1], x2 =  X[,2])

rm(list = ls()[!(ls() %in% c("Glist", "data", "theta"))])

ytmp    <- simTobitnet(formula = ~ x1 + x2 | x1 + x2, Glist = Glist,
                       theta = theta, data = data)

y       <- ytmp$y

# plot histogram
hist(y)

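# Optional settings for optimizer = "optim" (not used in the call below,
# which relies on the nlm defaults)
opt.ctr <- list(method  = "Nelder-Mead", 
                control = list(abstol = 1e-16, reltol = 1e-11, maxit = 5e3))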
data    <- data.frame(yt = y, x1 = data$x1, x2 = data$x2)
rm(list = ls()[!(ls() %in% c("Glist", "data"))])

out     <- SARTML(formula = yt ~ x1 + x2, optimizer = "nlm",
                  contextual = TRUE, Glist = Glist, data = data)
summary(out)
# }