NHANES (version 2.0)

NHANES: NHANES 2009-2012 with adjusted weighting

Description

This is survey data collected by the US National Center for Health Statistics (NCHS), which has conducted a series of health and nutrition surveys since the early 1960s. Since 1999, approximately 5,000 individuals of all ages have been interviewed in their homes each year and have completed the health examination component of the survey. The health examination is conducted in a mobile examination centre (MEC).

Usage

data(NHANES)

Arguments

format

Data frames with raw and resampled versions of the NHANES data. See below for details and descriptions of the variables.

source

These data were originally assembled by Michelle Dalrymple of Cashmere High School and Chris Wild of the University of Auckland, New Zealand for use in teaching statistics.

NHANES warning

The following warning comes directly from the NHANES web site:

For NHANES datasets, the use of sampling weights and sample design variables is recommended for all analyses because the sample design is a clustered design and incorporates differential probabilities of selection. If you fail to account for the sampling parameters, you may obtain biased estimates and overstate significance levels.
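The effect described in the warning can be illustrated with a small simulation. The sketch below does not use NHANES data: the population, the group means, and the selection probabilities are all hypothetical values chosen for illustration. It shows only how differential selection probabilities bias an unweighted mean and how inverse-probability weights correct the point estimate; it does not address clustering or variance estimation.

```r
set.seed(1)

# Hypothetical population: 90% group A (mean 50), 10% group B (mean 70).
N <- 100000
pop <- data.frame(group = rep(c("A", "B"), times = c(90000, 10000)))
pop$y <- rnorm(N, mean = ifelse(pop$group == "A", 50, 70), sd = 5)

# Differential selection probabilities: group B is oversampled 9-fold.
pop$p_select <- ifelse(pop$group == "A", 0.01, 0.09)
smp <- pop[runif(N) < pop$p_select, ]

# Sampling weight = inverse of the selection probability.
smp$w <- 1 / smp$p_select

true_mean <- mean(pop$y)
naive_mean <- mean(smp$y)                     # ignores the design
weighted_mean <- weighted.mean(smp$y, smp$w)  # uses the sampling weights

round(c(population = true_mean, naive = naive_mean, weighted = weighted_mean), 2)
```

The unweighted mean is pulled toward the oversampled group, while the weighted mean should land close to the population value.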

Disclaimer

Please note that the data sets provided in this package are derived from the NHANES database and have been adapted for educational purposes. As such, they are NOT suitable for use as a research database. For research purposes, you should download the original data files from the NHANES website and follow the analysis instructions given there. Further details and relevant documentation can be found on the following NHANES websites:
  • http://www.cdc.gov/nchs/nhanes.htm,
  • http://wwwn.cdc.gov/nchs/nhanes/search/nhanes11_12.aspx, and
  • http://wwwn.cdc.gov/nchs/nhanes/search/nhanes09_10.aspx.

Details

The NHANES target population is "the non-institutionalized civilian resident population of the United States". NHANES (the American National Health and Nutrition Examination Surveys) uses complex survey designs (see http://www.cdc.gov/nchs/data/series/sr_02/sr02_162.pdf) that oversample certain subpopulations, such as racial minorities. Naive analysis of the original NHANES data can lead to mistaken conclusions. For example, the percentages of people from each racial group in the data differ markedly from those in the population.

NHANES and NHANESraw each include 75 variables available for the 2009-2010 and 2011-2012 sample years. NHANESraw has 20,293 observations of these variables plus four additional variables that describe the sample weighting scheme employed. NHANES contains 10,000 rows of data resampled from NHANESraw to undo these oversampling effects. NHANES can be treated, for educational purposes, as if it were a simple random sample from the American population.
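One way such a resampling step could work is to draw rows with probability proportional to their sampling weights. The sketch below is an assumption about the general technique, not the procedure the package authors actually used, and it runs on a hypothetical toy data frame (`raw`, with made-up `group` and `weight` columns) rather than on NHANESraw.

```r
set.seed(2)

# Hypothetical raw sample: group B makes up half the rows but carries small weights.
raw <- data.frame(
  group = rep(c("A", "B"), each = 1000),
  weight = rep(c(9, 1), each = 1000)  # population members each row represents
)

# Draw rows with replacement, with probability proportional to the weight.
idx <- sample(nrow(raw), size = 5000, replace = TRUE, prob = raw$weight)
resampled <- raw[idx, ]

# Compare group shares before and after resampling.
rbind(
  raw = prop.table(table(raw$group)),
  resampled = prop.table(table(resampled$group))
)
```

In the toy data the raw shares are equal, while the resampled shares should be close to the 9:1 ratio implied by the weights, so the resampled rows can be summarised without weights.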

A list of the variables in the data set appears below, along with variable descriptions and links to the original NHANES documentation.

Examples

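library(NHANES)  # provides the NHANES and NHANESraw data frames

# Due to the sampling design, some races were over/under-sampled.
# Compare the share of each Race1 level in the resampled and raw data.
p_resampled <- prop.table(table(NHANES$Race1))
p_raw <- prop.table(table(NHANESraw$Race1))
rbind(
  NHANES = p_resampled,
  NHANESraw = p_raw,
  diff = p_resampled - p_raw  # difference in proportions, not in raw counts
)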
