pima

Sources:
(a) Original owners: National Institute of Diabetes and Digestive and Kidney Diseases
(b) Donor of database: Vincent Sigillito <a href="mailto:vgs@aplcen.apl.jhu.edu">vgs@aplcen.apl.jhu.edu</a>
 Research Center, RMI Group Leader
 Applied Physics Laboratory
 The Johns Hopkins University
 Johns Hopkins Road
 Laurel, MD 20707
 (301) 953-6231
(c) Date received: 9 May 1990

datasets

Developed to assist in discovering interesting subgroups in high-dimensional data.
The PRIM (Patient Rule Induction Method) implementation is based on the 1998 paper "Bump hunting in high-dimensional data" by Jerome H. Friedman and Nicholas I. Fisher <doi:10.1023/A:1008894516817>.
PRIM finds a set of "rules" that, combined, imply unusually large values of a target variable.
Specifically, it looks for subregions in which the target variable is substantially larger than its overall mean.
More generally, the objective of bump hunting is to find regions of the input (attribute/feature) space with relatively high values of the target variable.
The regions are described by simple rules of the form "if condition-1 and ... and condition-n then estimated target value".
Given the data (or a subset of the data), the goal is to produce a box B within which the target mean is as large as possible.
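
The box search described above can be illustrated with a minimal top-down peeling sketch. It is written in plain Python and is not the subgroup.discovery API: the function name `peel_box`, its parameters `alpha` and `min_support`, and the synthetic grid data are all assumptions made for illustration. As a simplification, the sketch stops as soon as no peel raises the target mean, and a peel may remove more than a fraction `alpha` of the points when values are tied.

```python
def peel_box(X, y, alpha=0.1, min_support=0.1):
    """Greedy top-down peeling on rows X (lists of numbers) with target y.

    Hypothetical helper, not part of any package. Returns (box, mean, support):
    box[j] is the [low, high] interval kept for feature j; mean and support
    describe the points left inside the box.
    """
    n, p = len(X), len(X[0])
    inside = list(range(n))  # indices of the points currently in the box
    mean = sum(y) / n

    while len(inside) > 1:
        k = max(1, int(alpha * len(inside)))  # minimum number of points to peel
        best = None  # (mean, kept indices) of the best candidate peel
        for j in range(p):
            vals = sorted(X[i][j] for i in inside)
            candidates = (
                [i for i in inside if X[i][j] > vals[k - 1]],  # peel the low end
                [i for i in inside if X[i][j] < vals[-k]],     # peel the high end
            )
            for kept in candidates:
                # Skip peels that empty the box or shrink it below the minimum support.
                if not kept or len(kept) < min_support * n:
                    continue
                kept_mean = sum(y[i] for i in kept) / len(kept)
                if best is None or kept_mean > best[0]:
                    best = (kept_mean, kept)
        # Simplification: stop when no admissible peel increases the target mean.
        if best is None or best[0] <= mean:
            break
        mean, inside = best

    box = [[min(X[i][j] for i in inside), max(X[i][j] for i in inside)]
           for j in range(p)]
    return box, mean, len(inside)


# Synthetic 20 x 20 grid (made-up data): the target is 1 only in one corner region.
grid = [k / 19 for k in range(20)]
X = [[a, b] for a in grid for b in grid]
y = [1 if a > 0.6 and b < 0.4 else 0 for a, b in X]

box, mean, support = peel_box(X, y)
rule = " and ".join(f"{lo:.2f} <= x{j} <= {hi:.2f}" for j, (lo, hi) in enumerate(box))
print(f"if {rule} then estimated target = {mean:.2f} (support {support})")
```

On this synthetic grid the sketch recovers the corner region exactly: the final box contains the 64 points with target 1 and nothing else. The rule it prints has the "if condition-1 and ... and condition-n then estimated target value" form, with one interval condition per feature.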

Jurian Baas

subgroup.discovery

Subgroup Discovery and Bump Hunting

Ad Feelders

pima function

A data frame with 768 rows and 9 variables:<dl class="dl-horizontal">
 <dt>pregnant</dt><dd>Number of times pregnant</dd>
 <dt>glucose</dt><dd>Plasma glucose concentration at 2 hours in an oral glucose tolerance test</dd>
 <dt>bp</dt><dd>Diastolic blood pressure (mm Hg)</dd>
 <dt>skin_thickness</dt><dd>Triceps skin fold thickness (mm)</dd>
 <dt>insulin</dt><dd>2-hour serum insulin (μU/ml)</dd>
 <dt>bmi</dt><dd>Body mass index (weight in kg/(height in m)^2)</dd>
 <dt>diabetes</dt><dd>Diabetes pedigree function</dd>
 <dt>age</dt><dd>Age (years)</dd>
 <dt>class</dt><dd>Class variable (0 or 1)</dd>
</dl>

Format


Pima Indians Diabetes Database. — pima

