This dataset, boston_pts, is a data frame containing information on housing values
and neighborhood characteristics in the Boston area. It is based on the classic dataset
by Harrison and Rubinfeld (1978), corrected for minor errors and augmented with the latitude
and longitude of the observations. Gilley and Pace, who published the corrections, also
note that the MEDV variable is censored: median values at or over USD 50,000 are
recorded as USD 50,000.
data(boston_pts)
A data frame with 506 observations and 20 variables:
Town name (factor with 92 levels)
Town number (integer)
Census tract number (integer)
Longitude (numeric)
Latitude (numeric)
Median value of owner-occupied homes in USD 1,000s (numeric, censored at 50)
Corrected median value of owner-occupied homes in USD 1,000s (numeric)
Per capita crime rate by town (numeric)
Proportion of residential land zoned for lots over 25,000 sq.ft. (numeric)
Proportion of non-retail business acres per town (numeric)
Charles River dummy variable (factor: "0" = tract does not border the river, "1" = tract borders the river)
Nitric oxides concentration (parts per 10 million, numeric)
Average number of rooms per dwelling (numeric)
Proportion of owner-occupied units built prior to 1940 (numeric)
Weighted distances to five Boston employment centers (numeric)
Index of accessibility to radial highways (integer)
Full-value property-tax rate per USD 10,000 (integer)
Pupil-teacher ratio by town (numeric)
Transformed proportion of Black residents, 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents (numeric)
Percentage of the population of lower status (numeric)
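Two of the variables above need care before modelling: the censored median value and the transformed proportion. The base R sketch below illustrates both on made-up values rather than on rows of boston_pts. The column name MEDV comes from the description above; the toy data frame and the helper name black_index are assumptions introduced only for illustration.

```r
# Toy values for illustration only; these are not rows from boston_pts.
toy <- data.frame(MEDV = c(24.0, 50.0, 33.4, 50.0))

# MEDV is recorded in USD 1,000s, so the censoring point is 50.
# Rows at the cap are lower bounds on the true median, not exact values.
toy$censored <- toy$MEDV >= 50
n_censored <- sum(toy$censored)

# Hypothetical helper for the documented transformation 1000 * (Bk - 0.63)^2,
# where Bk is a proportion between 0 and 1.
black_index <- function(bk) 1000 * (bk - 0.63)^2
index_values <- black_index(c(0, 0.63, 1))
```

Because the transformation is a parabola centred on 0.63, it is not one-to-one: proportions equally far above and below 0.63 produce the same value, so the original proportion cannot be recovered from the transformed variable alone.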
The 506 observations cover 20 variables describing socio-economic,
environmental, and housing characteristics. Geographic coordinates (longitude and latitude)
are provided for spatial analysis. Related data objects include boston.utm, a matrix
of tract point coordinates projected to UTM zone 19, and boston.soi, a sphere of
influence neighbors list.
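As a minimal sketch of what the longitude and latitude columns make possible, the base R function below computes great-circle distances with the haversine formula. The function name haversine_km, the spherical Earth radius of 6371 km, and the two coordinate pairs are assumptions chosen for illustration; they are not taken from boston_pts or from the lightsf package.

```r
# Great-circle distance in kilometres between points given in decimal degrees.
# Assumes a spherical Earth; radius_km = 6371 is a common mean-radius value.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Two made-up points in the Boston area (longitude, latitude).
d_km <- haversine_km(-71.06, 42.36, -71.11, 42.37)
```

The function is vectorised, so it can be applied to whole columns at once. For work that needs planar distances or neighbour structures, projected coordinates such as those in boston.utm are usually the more convenient starting point.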
The dataset name has been kept as boston_pts to avoid confusion with other datasets
in the R ecosystem and to mark it as part of the lightsf package.
The suffix pts indicates that the dataset includes spatial point information.
The original content has not been modified in any way.