Learn R Programming

chords (version 0.67)

estimate.rds: Degree distribution estimation in respondent driven samples

Description

ML estimation of population size and degree distribution in respondent driven samples.

Usage

estimate.rds(sampled.degree.vector, Sij, method="BFGS", initial.values, arc=FALSE, control=generate.rds.control(), all.solutions=FALSE)
     
    var.theta(sampled.degree.vector, Sij, Njs, theta, beta)
    
    generate.rds.control(maxit=2000)
    
    estimate.rds.two.stage(sampled.degree.vector, Sij, method="BFGS", initial.values, arc=FALSE, control=generate.rds.control(), all.solutions=FALSE)

Arguments

sampled.degree.vector
An integer vector of the degree of each individual sampled (including zeroes).
Sij
An integer matrix of the counts of individuals sampled with rank i up to sampling period j. Row names are required and represent degrees.
method
Optimization method passed to optim.
initial.values
List of initialization values. See details and examples.
arc
Deprecated.
control
List of control parameters. Generated by generate.rds.control.
theta
Numeric value of $\theta$.
Njs
Numeric vector of fhe frequecies of degree $j$.
beta
Numeric value of $\beta$
maxit
Used by generate.rds.control to control the behavious of optim.
all.solutions
Should all local maxima be returned or just one.

Value

  • estimate.rds returns a list with the estimates for each starting values. Each components includes:
  • betaSee in details
  • thetaSee in details
  • initial.valuesValues used to initiate the optimization.
  • NjSee in details
  • iterationsThe number of optim iterations until convergence.
  • likelihood.optimumThe (log) likelhood at convergence.
  • callThe function call.
  • var.theta returns an estimate of the asymptotic variance of $\theta$.

    estimate.rds.two.stage will not return $\theta$ and $\beta$ but rather a vector of $\beta_j$ for all observed degrees.

    generate.rds.control returns a list of control values needed by estimate.rds. At present this includes only the number of optim iterations.

Details

estimate.rds performs maximum likelihood estimation of the population size, degree distribution and "coefficient of discoverability" as described in the reference. The method assumes the probability of sampling an individual with degree i at time j is given by: $$\beta I[j] (N[i]-S[i,j]) i^\theta$$ where $\beta$ is the base rate of sampling, $\theta$ is the "coefficient of discoverability", $I[j]$ it the number of individuals sampled up to time $j$, $N[i]$ is the number of individuals with degree $i$ in the population, and $S[i,j]$ is the number of individuals with degree $i$ sampled up to time $j$. The function estimate.rds has to be given some initialization values. By default, it is given a grid of $\theta$ values, but it can also be given a list including other parameter values (as in the example). estimate.rds.two.stage works like estimate.rds but does not assume $\beta_j=\beta * j^ \theta$, so tries to estimate $\beta_j$ directly. It does so in two stages: in the first it calls estimate.rds, and once the optimal values have been found, it relaxes the $\beta_j=\beta * j^ \theta$ assumptions and optimizes for $\beta_j)$

References

Respondent Driven Sampling as an Epidemic Process Berchenko Y., Rosenblatt J.D., White R.G.,Frost S.D.W. (Submitted)

Examples

Run this code
#### Commented due to long execution time.####

#### Estimating assuming beta_j=beta * j^theta:
data(simulation)
temp.data<- unlist(data3[1,7000:7500])
(rds.result<- estimate.rds(sampled.degree.vector=temp.data , Sij=make.Sij(temp.data), initial.values=list(theta=c(1,2))))
plot(rds.result$Nj, type='h', xlab='Degree', ylab=expression(N[j]), main='Estimated Degree Distribution')	
var.theta(temp.data, Njs=rds.result$Nj, theta=rds.result$theta, beta=rds.result$beta)

#### Example of the two-stage estimation of beta_j ####
estimation3<- estimate.rds.two.stage(sampled.degree.vector=temp.data, Sij = make.Sij(temp.data), 
											initial.values=list(theta=c(0.5,1,1.5)),
											control=generate.rds.control(maxit=5))
plot(estimation3$beta_js~estimation3$observed_js)
lines(lowess(estimation3$beta_js~estimation3$observed_js), col='red')


### Example using a full set of initialization values:
Njs<- c(100,100,500,1000); names(Njs)<- c("1","50","100","1000"); Njs<- as.table(Njs)
theta<- 1
beta<- 1e-10
tail(degree.sampled.vec<- generate.sample(theta, Njs, beta, sample.length=1e5))
str(rds.result<- estimate.rds(degree.sampled.vec, Sij = make.Sij(degree.sampled.vec), 				
				initial.values = list(theta=c(theta), beta=c(beta), Njs=list(Njs))))

Run the code above in your browser using DataLab