This function removes missing values from the output of the Get_DB_MIUR function in two steps:
First, it deletes the columns whose number of missing values exceeds a threshold (1000 by default), together with the columns that cannot be converted into Boolean variables.
Then, it deletes the rows in which missing values remain.
Finally, the remaining data are converted into Boolean variables. Optionally, the function keeps track of the deleted rows.
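The steps above can be sketched in base R on a toy data frame. This is only an illustration of the logic, not the package's implementation: the helper name clean_to_bool, the toy column names, and the "YES"/"NO" encoding of the raw values are all assumptions made for the example.

```r
# Toy input (hypothetical columns): two identifier columns followed by
# indicator columns encoded as "YES"/"NO" (an assumed encoding).
toy <- data.frame(
  School_code  = c("A1", "B2", "C3", "D4"),
  Municipality = c("X", "Y", "Z", "W"),
  gym          = c("YES", "NO", NA, "YES"),
  canteen      = c(NA, NA, NA, "NO"),
  surface      = c("120", "85", "200", "90"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented steps.
clean_to_bool <- function(df, id_cols, col_cut_thresh) {
  value_cols <- setdiff(names(df), id_cols)

  # Step 1: drop columns with too many missing values, and columns whose
  # non-missing values cannot be read as Boolean.
  na_counts <- colSums(is.na(df[value_cols]))
  convertible <- vapply(
    df[value_cols],
    function(x) all(x[!is.na(x)] %in% c("YES", "NO")),
    logical(1)
  )
  keep <- value_cols[na_counts <= col_cut_thresh & convertible]
  df <- df[c(id_cols, keep)]

  # Step 2: drop the rows in which missing values remain,
  # recording the identifier of each deleted row.
  complete <- stats::complete.cases(df[keep])
  deleted <- df[[id_cols[1]]][!complete]
  df <- df[complete, , drop = FALSE]

  # Final step: convert the surviving indicator columns to 0/1.
  df[keep] <- lapply(df[keep], function(x) as.numeric(x == "YES"))

  list(data = df, deleted = deleted)
}

res <- clean_to_bool(
  toy,
  id_cols = c("School_code", "Municipality"),
  col_cut_thresh = 2
)
res$deleted   # "C3"
res$data$gym  # 1 0 1
```

Here canteen is dropped because it has 3 missing values (above the toy threshold of 2), surface is dropped because it is not Boolean-like, and the row of school C3 is dropped because its gym value is missing.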
Util_DB_MIUR_bool(
data = NULL,
cutout = NULL,
col_cut_thresh = 10^3,
verbose = TRUE,
track_deleted = TRUE,
autoAbort = FALSE,
...
)
If track_deleted == TRUE, an object of class list including two objects:
$data: object of class tbl_df, tbl and data.frame, the output dataframe. All variables besides the first 8 (which identify the record) are numeric.
$deleted: character. The school codes corresponding to the deleted rows.
If track_deleted == FALSE, the output is only the first element of the list.
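Because the shape of the return value depends on track_deleted, downstream code may want to normalise it. The sketch below uses a hypothetical helper, normalise_result, which is not part of the package. Since a data frame is itself a list in R, the check relies on is.data.frame rather than is.list.

```r
# Hypothetical helper: always return a list with $data and $deleted,
# whichever value of track_deleted produced the input.
normalise_result <- function(res) {
  if (is.data.frame(res)) {
    # track_deleted == FALSE: only the cleaned data frame was returned,
    # so the deleted school codes are unknown.
    return(list(data = res, deleted = NULL))
  }
  # track_deleted == TRUE: a list with $data and $deleted.
  res[c("data", "deleted")]
}

# Toy stand-ins for the two possible return shapes.
toy_data    <- data.frame(School_code = c("A1", "B2"), gym = c(1, 0))
tracked     <- list(data = toy_data, deleted = "C3")
not_tracked <- toy_data

out_tracked     <- normalise_result(tracked)
out_not_tracked <- normalise_result(not_tracked)
```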
data: Object of class tbl_df, tbl and data.frame. Input data obtained through the function Get_DB_MIUR. If NULL, it will be downloaded automatically with the appropriate arguments, but not saved in the global environment. NULL by default.
cutout: Character. The columns to cut out. If NULL, they will be determined automatically. NULL by default.
col_cut_thresh: Numeric. The maximum number of missing values allowed for each variable. If a variable has a higher number of missing observations, it is cut out. 1000 by default.
verbose: Logical. If TRUE, the user is kept informed of the main underlying operations. TRUE by default.
track_deleted: Logical. If TRUE, the function returns the codes of the schools not included in the output dataframe. TRUE by default.
autoAbort: Logical. Whether to automatically abort the operation and return NULL in case of a missing internet connection or server response errors, should any data need to be retrieved. FALSE by default.