SchoolDataIT (version 0.2.4)

Util_DB_MIUR_bool: Clean and convert the raw school buildings data to Boolean variables

Description

This function removes missing values from the output of the Get_DB_MIUR function in two steps:

  • First, it deletes the columns whose number of missing values exceeds a threshold (1000 by default), together with the columns that cannot be converted into Boolean variables

  • Then, it deletes the rows in which missing values remain

Finally, the remaining data are converted into Boolean variables. If track_deleted is TRUE, the function also returns the codes of the schools whose rows were deleted.
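
The three steps above can be sketched on toy data in base R. This is only an illustration of the logic, not the package's implementation: the column names, the "SI"/"NO" coding and the small threshold are assumptions made for the example, and the removal of columns that cannot be converted to Boolean is left out.

```r
# Toy data (hypothetical): two identifier columns followed by raw attributes.
raw <- data.frame(
  school_code = c("A001", "A002", "A003", "A004"),
  name        = c("Alpha", "Beta", "Gamma", "Delta"),
  has_gym     = c("SI", "NO", NA, "SI"),
  has_canteen = c(NA, NA, NA, "SI"),
  has_lift    = c("NO", "NO", "SI", "SI"),
  stringsAsFactors = FALSE
)

id_cols <- c("school_code", "name")
col_cut_thresh <- 2  # illustrative only; the documented default is 1000

# Step 1: drop attribute columns whose missing-value count exceeds the threshold.
attr_cols <- setdiff(names(raw), id_cols)
n_missing <- colSums(is.na(raw[attr_cols]))
keep_cols <- attr_cols[n_missing <= col_cut_thresh]
step1 <- raw[c(id_cols, keep_cols)]

# Step 2: drop rows that still contain missing values, recording their codes.
complete <- stats::complete.cases(step1)
deleted <- step1$school_code[!complete]
step2 <- step1[complete, , drop = FALSE]

# Step 3: convert the remaining attributes to 0/1 (assumed "SI" means TRUE).
step2[keep_cols] <- lapply(step2[keep_cols], function(x) as.numeric(x == "SI"))

result <- list(data = step2, deleted = deleted)
```

In this toy case has_canteen is dropped in the first step because it has three missing values, and school A003 is dropped in the second step because has_gym is still missing for it.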

Usage

Util_DB_MIUR_bool(
  data = NULL,
  cutout = NULL,
  col_cut_thresh = 10^3,
  verbose = TRUE,
  track_deleted = TRUE,
  autoAbort = FALSE,
  ...
)

Value

If track_deleted == TRUE, an object of class list containing two elements:

  • $data: object of class tbl_df, tbl and data.frame, the output dataframe. All variables except the first 8 (which identify the record) are numeric.

  • $deleted: character. The school codes corresponding to deleted rows

If track_deleted == FALSE, the output is only the first element of the list.

Arguments

data

Object of class tbl_df, tbl and data.frame. Input data obtained through the function Get_DB_MIUR. If NULL, it will be downloaded automatically with the appropriate arguments, but not saved in the global environment. NULL by default.

cutout

Character. The columns to cut out. If NULL, it will be determined automatically. NULL by default.

col_cut_thresh

Numeric. The threshold of missing values allowed for each variable. If a variable has a higher number of missing observations, it is cut out. 1000 by default.

verbose

Logical. If TRUE, the user keeps track of the main underlying operations. TRUE by default.

track_deleted

Logical. If TRUE, the function returns the codes of the schools not included in the output dataframe. TRUE by default.

autoAbort

Logical. If any data must be retrieved, whether to abort the operation automatically and return NULL when the internet connection is missing or the server returns an error. FALSE by default.
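
Since the return value can take three shapes (NULL after an aborted download, a bare dataframe when track_deleted is FALSE, or a list with $data and $deleted), calling code may want to normalise it before use. The sketch below is one possible way to do so. The helper unpack_result is hypothetical and not part of SchoolDataIT, and the call itself is guarded so that nothing is downloaded in non-interactive sessions.

```r
# Hypothetical helper (not part of SchoolDataIT): normalise the documented
# return shapes into a list with elements $data and $deleted.
unpack_result <- function(res) {
  if (is.null(res)) {
    # autoAbort = TRUE and the data could not be retrieved.
    return(NULL)
  }
  if (is.data.frame(res)) {
    # track_deleted = FALSE: only the cleaned dataframe is returned.
    # Checked before the list case because a data frame is also a list.
    # $deleted is NULL here because deleted rows were not tracked.
    return(list(data = res, deleted = NULL))
  }
  res[c("data", "deleted")]
}

# Guarded call: runs only interactively and only if the package is installed.
if (interactive() && requireNamespace("SchoolDataIT", quietly = TRUE)) {
  res <- SchoolDataIT::Util_DB_MIUR_bool(
    data = NULL,            # let the function download the input itself
    col_cut_thresh = 1000,  # documented default, written out explicitly
    verbose = FALSE,
    track_deleted = TRUE,
    autoAbort = TRUE        # return NULL instead of failing on network errors
  )
  out <- unpack_result(res)
  if (is.null(out)) {
    message("Download aborted; no data returned.")
  } else {
    message(nrow(out$data), " rows kept, ", length(out$deleted), " rows deleted.")
  }
}
```

The data frame check comes first on purpose: is.list() is also TRUE for a data frame, so testing for a list first would misclassify the track_deleted = FALSE case.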