estimate.rds: Degree distribution estimation in respondent driven samples

Description

ML estimation of population size and degree distribution in respondent driven samples.

Usage

estimate.rds(sampled.degree.vector, Sij, method="BFGS", initial.values, arc=FALSE, control=generate.rds.control(), all.solutions=FALSE)
     
    var.theta(sampled.degree.vector, Sij, Njs, theta, beta)
    
    generate.rds.control(maxit=2000)
    
    estimate.rds.two.stage(sampled.degree.vector, Sij, method="BFGS", initial.values, arc=FALSE, control=generate.rds.control(), all.solutions=FALSE)

Arguments

sampled.degree.vector

An integer vector of the degree of each individual sampled (including zeroes).

Sij

An integer matrix of the counts of individuals sampled with rank i up to sampling period j. Row names are required and represent degrees.

method

Optimization method passed to optim.

initial.values

List of initialization values. See details and examples.

arc

Deprecated.

control

List of control parameters. Generated by generate.rds.control.

theta

Numeric value of $\theta$.

Njs

Numeric vector of fhe frequecies of degree $j$.

beta

Numeric value of $\beta$

maxit

Used by generate.rds.control to control the behavious of optim.

all.solutions

Should all local maxima be returned or just one.

Value

estimate.rds returns a list with the estimates for each starting values. Each components includes:
betaSee in details
thetaSee in details
initial.valuesValues used to initiate the optimization.
NjSee in details
iterationsThe number of optim iterations until convergence.
likelihood.optimumThe (log) likelhood at convergence.
callThe function call.
var.theta returns an estimate of the asymptotic variance of $\theta$.
estimate.rds.two.stage will not return $\theta$ and $\beta$ but rather a vector of $\beta_j$ for all observed degrees.
generate.rds.control returns a list of control values needed by estimate.rds. At present this includes only the number of optim iterations.

Details

estimate.rds performs maximum likelihood estimation of the population size, degree distribution and "coefficient of discoverability" as described in the reference. The method assumes the probability of sampling an individual with degree i at time j is given by: $$\beta I[j] (N[i]-S[i,j]) i^\theta$$ where $\beta$ is the base rate of sampling, $\theta$ is the "coefficient of discoverability", $I[j]$ it the number of individuals sampled up to time $j$, $N[i]$ is the number of individuals with degree $i$ in the population, and $S[i,j]$ is the number of individuals with degree $i$ sampled up to time $j$. The function estimate.rds has to be given some initialization values. By default, it is given a grid of $\theta$ values, but it can also be given a list including other parameter values (as in the example). estimate.rds.two.stage works like estimate.rds but does not assume $\beta_j=\beta * j^ \theta$, so tries to estimate $\beta_j$ directly. It does so in two stages: in the first it calls estimate.rds, and once the optimal values have been found, it relaxes the $\beta_j=\beta * j^ \theta$ assumptions and optimizes for $\beta_j)$

References

Respondent Driven Sampling as an Epidemic Process Berchenko Y., Rosenblatt J.D., White R.G.,Frost S.D.W. (Submitted)

Examples

Run this code

#### Commented due to long execution time.####

#### Estimating assuming beta_j=beta * j^theta:
data(simulation)
temp.data<- unlist(data3[1,7000:7500])
(rds.result<- estimate.rds(sampled.degree.vector=temp.data , Sij=make.Sij(temp.data), initial.values=list(theta=c(1,2))))
plot(rds.result$Nj, type='h', xlab='Degree', ylab=expression(N[j]), main='Estimated Degree Distribution')	
var.theta(temp.data, Njs=rds.result$Nj, theta=rds.result$theta, beta=rds.result$beta)

#### Example of the two-stage estimation of beta_j ####
estimation3<- estimate.rds.two.stage(sampled.degree.vector=temp.data, Sij = make.Sij(temp.data), 
											initial.values=list(theta=c(0.5,1,1.5)),
											control=generate.rds.control(maxit=5))
plot(estimation3$beta_js~estimation3$observed_js)
lines(lowess(estimation3$beta_js~estimation3$observed_js), col='red')


### Example using a full set of initialization values:
Njs<- c(100,100,500,1000); names(Njs)<- c("1","50","100","1000"); Njs<- as.table(Njs)
theta<- 1
beta<- 1e-10
tail(degree.sampled.vec<- generate.sample(theta, Njs, beta, sample.length=1e5))
str(rds.result<- estimate.rds(degree.sampled.vec, Sij = make.Sij(degree.sampled.vec), 				
				initial.values = list(theta=c(theta), beta=c(beta), Njs=list(Njs))))

Run the code above in your browser using DataLab