Wage
Mid-Atlantic Wage Data
Wage and other data for a group of 3000 male workers in the Mid-Atlantic region.
- Keywords
- datasets
Usage
Wage
Format
A data frame with 3000 observations on the following 11 variables.
year
Year that wage information was recorded
age
Age of worker
maritl
A factor with levels
1. Never Married
2. Married
3. Widowed
4. Divorced
5. Separated
indicating marital status
race
A factor with levels
1. White
2. Black
3. Asian
4. Other
indicating race
education
A factor with levels
1. < HS Grad
2. HS Grad
3. Some College
4. College Grad
5. Advanced Degree
indicating education level
region
Region of the country (Mid-Atlantic only)
jobclass
A factor with levels
1. Industrial
2. Information
indicating type of job
health
A factor with levels
1. <=Good
2. >=Very Good
indicating health level of worker
health_ins
A factor with levels
1. Yes
2. No
indicating whether worker has health insurance
logwage
Log of worker's wage
wage
Worker's raw wage
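The variable list can be checked against the data itself. The following is a minimal sketch, not part of the package documentation: `max_logwage_gap` is a hypothetical helper, `toy` is made-up data, and the use of the natural logarithm for `logwage` is an assumption that the check is meant to test. The last part assumes the ISLR package, which provides `Wage`, is installed.

```r
# Hypothetical helper (not part of ISLR): largest absolute difference
# between exp(logwage) and wage. A value near zero suggests that
# logwage is the natural log of wage.
max_logwage_gap <- function(df) {
  max(abs(exp(df$logwage) - df$wage))
}

# Toy data frame with the same two column names, so the helper can be
# tried without the package. The values are made up for illustration.
toy <- data.frame(wage = c(50, 100, 150))
toy$logwage <- log(toy$wage)
max_logwage_gap(toy)

# With the real data (assumes the ISLR package is installed):
if (requireNamespace("ISLR", quietly = TRUE)) {
  data("Wage", package = "ISLR")
  str(Wage)                # column types and first few values
  levels(Wage$education)   # factor levels as stored, numeric prefixes included
  max_logwage_gap(Wage)    # inspect the result rather than assuming it is zero
}
```

Because the numeric prefixes are part of the level names, comparisons need the full string, for example `Wage$jobclass == "2. Information"`.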
References
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013) An Introduction to Statistical Learning with Applications in R, www.StatLearning.com, Springer-Verlag, New York.
Examples
# NOT RUN {
summary(Wage)
lm(wage ~ year + age, data = Wage)
## maybe str(Wage) ; plot(Wage) ...
# }
Community examples
# Take a look at the Wage dataset using dim(), str() and summary().
dim(Wage)
str(Wage)
summary(Wage)

# Fit a linear regression model that explains wage as a function of age.
lm_wage <- lm(wage ~ age, data = Wage)

# Define a data.frame of new, unseen samples.
new <- data.frame(age = c(40, 60, 70))

# Predict the wage for 40-, 60- and 70-year-old workers.
wage_pred <- predict(lm_wage, new)

# Plot the original data and the predictions.
plot(Wage$wage ~ Wage$age)
points(c(40, 60, 70), wage_pred, col = "red", pch = 20, cex = 2)