svytest (version 1.1.0)

svytestCE: Subset of 2015 Consumer Expenditure (CE) Dataset

Description

A curated subset of rows and columns from the Consumer Expenditure (CE) dataset that is provided in the [rpms](https://CRAN.R-project.org/package=rpms) package by Daniell Toth. This example dataset is designed for demonstration purposes within this package. Please refrain from using this dataset for inferential purposes.

Usage

svytestCE

Format

A data frame with _n_ rows and _m_ variables:

NEWID

Consumer unit identifying variable, constructed using the first seven digits of a unique identifier.

CID

Cluster Identifier for all clusters (constructed using PSU, REGION, STATE, and POPSIZE).

QINTRVMO

Month for which the data were collected.

FINLWT21

Final sample weight used to make population inferences.

STATE

State FIPS code indicating the location of the consumer unit.

REGION

Region code: 1 = Northeast, 2 = Midwest, 3 = South, 4 = West.

BLS_URBN

Indicator of urban (1) versus rural (2) residence status.

POPSIZE

Population size class of the PSU, ranging from 1 (largest) to 5 (smallest).

CUTENURE

Housing tenure: 1 = Owned with mortgage; 2 = Owned without mortgage; 3 = Owned (mortgage not reported); 4 = Rented; 5 = Occupied without cash rent; 6 = Student housing.

ROOMSQ

Number of rooms (including finished living areas but excluding bathrooms).

BATHRMQ

Number of bathrooms in the consumer unit.

BEDROOMQ

Number of bedrooms in the consumer unit.

VEHQ

Number of owned vehicles.

FAM_TYPE

Household type based on the relationship of members to the reference person; for example, 1 = Married Couple only, 2 = Married Couple with children (oldest < 6 years), 3 = Married Couple with children (oldest 6-17 years), etc.

FAM_SIZE

Number of members in the consumer unit (family size).

PERSLT18

Count of persons less than 18 years old in the consumer unit.

PERSOT64

Count of persons older than 64 years in the consumer unit.

NO_EARNR

Number of earners in the consumer unit.

AGE

Age of the primary earner.

EDUCA

Education level of the primary earner, coded as 1 = None, 2 = 1st-8th Grade, 3 = Some high school, 4 = High school, 5 = Some college, 6 = AA degree, 7 = Bachelor's degree, 8 = Advanced degree.

SEX

Gender of the primary earner (F = Female, M = Male).

MARITAL

Marital status of the primary earner (1 = Married, 2 = Widowed, 3 = Divorced, 4 = Separated, 5 = Never Married).

MEMBRACE

Race of the primary earner (e.g., 1 = White, 2 = Black, 3 = Native American, 4 = Asian, 5 = Pacific Islander, 6 = Multi-race).

HORIGIN

Indicator of Hispanic, Latino, or Spanish origin (Y for yes, N for no).

ARM_FORC

Indicator if the primary earner is a member of the armed forces (Y/N).

IN_COLL

Current college enrollment status for the primary earner (Full for full time, Part for part time, No for not enrolled).

EARNTYPE

Type of employment for the primary earner: 1 = Full time all year, 2 = Part time all year, 3 = Full time part-year, 4 = Part time part-year.

OCCUCODE

Occupational code representing the primary job of the earner.

INCOMEY

Type of employment: coded as 1 = Employee of a private company, 2 = Federal government employee, 3 = State government employee, 4 = Local government employee, 5 = Self-employed, 6 = Working without pay in a family business.

FINCBTAX

Amount of consumer unit income before taxes in the past 12 months.

SALARYX

Wage or salary income received in the past 12 months, before deductions.

SOCRRX

Income received from Social Security and Railroad Retirement in the past 12 months.

TOTEXPCQ

Total expenditures reported for the current quarter.

TOTXEST

Total taxes paid (estimated) in the current period.

EHOUSNGC

Total expenditures for housing in the current quarter.

HEALTHCQ

Expenditures for health care during the current quarter.

FOODCQ

Expenditures on food during the current quarter.
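
The coded variables and the weight column above can be combined as in the following minimal sketch. It uses a small made-up data frame `ce` as a stand-in for a few `svytestCE` columns, so the values are illustrative only, and it assumes the codes are stored as integers (the actual storage types in the package may differ). The result is a descriptive illustration, not an inferential estimate.

```r
# Toy stand-in for a few svytestCE columns (values are made up for illustration).
ce <- data.frame(
  REGION   = c(1, 2, 3, 4, 3),
  BLS_URBN = c(1, 1, 2, 1, 2),
  FINLWT21 = c(1000, 2000, 1500, 500, 1000),
  FOODCQ   = c(1200, 900, 1500, 2000, 600)
)

# Attach the documented labels to the integer codes.
ce$REGION <- factor(ce$REGION, levels = 1:4,
                    labels = c("Northeast", "Midwest", "South", "West"))
ce$BLS_URBN <- factor(ce$BLS_URBN, levels = 1:2,
                      labels = c("Urban", "Rural"))

# Weighted mean of quarterly food expenditure, using FINLWT21 as the weight.
food_mean <- weighted.mean(ce$FOODCQ, w = ce$FINLWT21)

# The same weighted mean computed separately within each region.
food_by_region <- sapply(
  split(ce, ce$REGION),
  function(d) weighted.mean(d$FOODCQ, w = d$FINLWT21)
)

print(food_mean)
print(food_by_region)
```

With the real data, the same calls would apply to `svytestCE` in place of `ce`, provided the column types match the assumption stated above.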

Details

This example dataset is a subset extracted from the complete CE dataset used by the rpms package. It is intended to illustrate how to work with survey data in the context of recursive partitioning. The original CE data contain 68,415 observations on 47 variables; this example contains a smaller selection for ease of demonstration. Several columns were removed because they were mostly missing, redundant, or not relevant to the examples. Rows were filtered to keep only those with strictly positive salary, expenditure, and tax values. Weights were not recalibrated after these changes.
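
The row filter described above might look like the following sketch. The data frame `ce_raw` is made up for illustration, and the choice of `SALARYX`, `TOTEXPCQ`, and `TOTXEST` as the salary, expenditure, and tax variables is an assumption; the source does not say which columns were used or how missing values were handled.

```r
# Made-up rows standing in for the unfiltered CE data.
ce_raw <- data.frame(
  NEWID    = 1:5,
  SALARYX  = c(52000, 0, 31000, NA, 47000),
  TOTEXPCQ = c(9000, 4000, 0, 7000, 8500),
  TOTXEST  = c(6000, 500, 2000, 1500, -200)
)

# Keep rows where all three variables are strictly positive.
# Assumption: rows with a missing value in any of them are dropped as well.
keep <- with(ce_raw, SALARYX > 0 & TOTEXPCQ > 0 & TOTXEST > 0)
ce_sub <- ce_raw[which(keep), ]

print(ce_sub)
```

Using `which()` drops rows where the comparison evaluates to `NA`, which is one reasonable reading of "strictly positive" but not the only one.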

See Also

rpms for an overview of the functions provided in the original package.