
epiR (version 0.9-69)

epi.clustersize: Sample size for cluster-sample surveys

Description

Estimates the number of clusters to be sampled using a cluster-sample design.

Usage

epi.clustersize(p, b, rho, epsilon.r, conf.level = 0.95)

Arguments

p
the estimated prevalence of the outcome in the population.
b
the number of units sampled per cluster.
rho
the intra-cluster correlation, a measure of the variation between clusters compared with the variation within clusters.
epsilon.r
scalar, the acceptable relative error (the acceptable absolute error divided by p).
conf.level
scalar, defining the level of confidence in the computed result.
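
Example 1 on this page derives rho from an assumed design effect D using rho = (D - 1) / (b - 1). The base R sketch below shows that relation and its inverse, D = 1 + (b - 1) * rho, which is the standard design-effect formula for equal-sized clusters. The helper names `design_effect` and `rho_from_design` are hypothetical and not part of epiR; that `epi.clustersize` computes its `design` value exactly this way is an assumption.

```r
# Hypothetical helpers (not part of epiR) illustrating how rho, b and the
# design effect D relate when every cluster contributes b units.

# Design effect from the intra-cluster correlation: D = 1 + (b - 1) * rho
design_effect <- function(b, rho) 1 + (b - 1) * rho

# Inverse relation, as used in Example 1: rho = (D - 1) / (b - 1)
rho_from_design <- function(b, D) (D - 1) / (b - 1)

rho <- rho_from_design(b = 50, D = 4)   # 3 / 49, roughly 0.061
design_effect(b = 50, rho = rho)        # recovers D = 4 (up to floating point)
design_effect(b = 20, rho = 0.02)       # 1.38, the setting used in Example 2
```

A larger rho or a larger number of units per cluster inflates D, and with it the number of clusters required for a given precision.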

Value

A list containing the following:

  • clusters: the estimated number of clusters to be sampled.
  • units: the total number of units to be sampled.
  • design: the design effect.
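
In both examples on this page the reported total number of units equals the number of clusters multiplied by b. The short check below reproduces that arithmetic from the reported figures. It is only a sketch: the variable names are illustrative, and it does not call `epi.clustersize` itself.

```r
# Figures reported in the Examples section of this page
clusters <- c(example1 = 278, example2 = 18)   # clusters to be sampled
b        <- c(example1 = 50,  example2 = 20)   # units sampled per cluster

# Total units = clusters * units per cluster
units <- clusters * b
units   # 13900 for Example 1, 360 for Example 2
```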

References

Bennett S, Woods T, Liyanage WM, Smith DL (1991). A simplified general method for cluster-sample surveys of health in developing countries. World Health Statistics Quarterly 44: 98 - 106.

Otte J, Gumm I (1997). Intra-cluster correlation coefficients of 20 infections calculated from the results of cluster-sample surveys. Preventive Veterinary Medicine 31: 147 - 150.

Examples

## EXAMPLE 1:
## The expected prevalence of disease in a population of cattle is 0.10.
## We wish to conduct a survey, sampling 50 animals per farm. No data  
## are available to provide an estimate of rho, though we suspect
## the intra-cluster correlation for this disease to be moderate.           
## We wish to be 95% certain of being within 10% of the true population
## prevalence of disease. How many herds should be sampled?

## Assume a design effect (D) of 4 and convert it to rho:
p <- 0.10; b <- 50; D <- 4
rho <- (D - 1) / (b - 1)
epi.clustersize(p = p, b = b, rho = rho, epsilon.r = 0.10, 
   conf.level = 0.95)

## We need to sample 278 herds (13900 samples in total).

## EXAMPLE 2 (from Bennett et al. 1991):
## A cross-sectional study is to be carried out to determine the prevalence
## of a given disease in a population using a two-stage cluster design. We 
## estimate prevalence to be 0.20 and we expect rho to be in the order of 0.02.
## We want to take sufficient samples to be 95% certain that our estimate of 
## prevalence is within 0.05 (5 percentage points) of the true population
## value (that is, a relative error of 0.05 / 0.20 = 0.25). Assuming 20
## responses from each cluster, how many clusters do we need to sample?

epi.clustersize(p = 0.20, b = 20, rho = 0.02, epsilon.r = 0.25, 
   conf.level = 0.95)

## We need to sample 18 clusters (360 samples in total).
