Several hierarchical clusterings, several k-means clusterings, and a model-based clustering were applied to several financial variables for a random sample of ten thousand observations.
data(CPScluster)
A data frame with 10000 observations on the following 39 variables.
Age
a numeric vector
Sex
a factor with levels female, male
Race
a factor with levels Black, White
Ethnic
a factor
Marital.Status
a factor
Kind.of.Family
a factor
Classical
a factor with levels All other, Classical Husband-Wife family
Family.Type
a factor
Number.of.Persons.in.Family
a numeric vector
Number.of.Kids
a numeric vector
Education.of.Head
a factor
Labor.Status
a factor
Class.of.Worker
a factor
Working.Hours
a numeric vector
Income.of.Head
a numeric vector
Family.Income
a numeric vector
Taxable.Income
a numeric vector
Federal.tax
a numeric vector
Family.sequence.number
a numeric vector
State
a factor
Division
a factor
Region
a factor with levels Midwest, North East, South, West
hc4
a numeric vector
hc6
a numeric vector
hc8
a numeric vector
hc12
a numeric vector
hcs4
a numeric vector
hcs6
a numeric vector
hcs8
a numeric vector
hcs12
a numeric vector
hcw4
a numeric vector
hcw6
a numeric vector
hcw8
a numeric vector
hcw12
a numeric vector
km4
a numeric vector
km6
a numeric vector
km8
a numeric vector
km12
a numeric vector
mc12
a numeric vector
data(CPScluster)
## compact overview of the 39 variables
str(CPScluster)
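The columns hc4 through mc12 presumably hold cluster labels from the clusterings mentioned in the description, with the numeric suffix giving the number of clusters; this reading is an assumption and is not stated explicitly above. Under that assumption, two clusterings can be compared with a contingency table. The sketch below uses small made-up label vectors (`labels_a`, `labels_b`) as hypothetical stand-ins for columns such as `CPScluster$hc4` and `CPScluster$km4`, so that it runs without the data being loaded.

```r
# Hypothetical stand-ins for two cluster label columns, e.g.
# CPScluster$hc4 and CPScluster$km4. The values are made up for illustration.
labels_a <- c(1, 1, 2, 2, 3, 3, 4, 4)
labels_b <- c(1, 1, 1, 2, 3, 3, 4, 2)

# Cross-tabulate the two partitions: rows are clusters of the first
# clustering, columns are clusters of the second.
agreement <- table(hc4 = labels_a, km4 = labels_b)
print(agreement)

# Rough agreement measure: for each row cluster take its largest cell,
# then divide the sum of those cells by the number of observations.
purity <- sum(apply(agreement, 1, max)) / sum(agreement)
print(purity)
```

With the real data, the same call would be `table(CPScluster$hc4, CPScluster$km4)`, again assuming these columns are cluster labels.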