This function checks whether an identifier column is consistent, i.e. that it exists, that there is only one, and that it does not overlap with any user-provided feature columns, identifiers, or outcome columns.
.check_input_identifier_column(
  id_column,
  data,
  signature = NULL,
  exclude_features = NULL,
  include_features = NULL,
  other_id_column = NULL,
  outcome_column = NULL,
  col_type,
  check_stringency = "strict"
)
id_column: Character string indicating the currently inspected identifier column.
data: Data set as loaded using the .load_data function.
signature: (optional) One or more names of feature columns that are considered part of a specific signature. Features specified here will always be used for modelling. Ranking from feature selection has no effect for these features.
exclude_features: (optional) Feature columns that will be removed from the data set. Cannot overlap with features in signature, novelty_features or include_features.
include_features: (optional) Feature columns that are specifically included in the data set. By default all features are included. Cannot overlap with exclude_features, but may overlap with signature. Features in signature and novelty_features are always included. If both exclude_features and include_features are provided, include_features takes precedence, provided that there is no overlap between the two.
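The interaction between signature, novelty_features, exclude_features and include_features can be illustrated with a minimal sketch. The helper resolve_feature_columns below is hypothetical and not part of the package. It assumes that an overlap with exclude_features is an error, and that when include_features is given only those features plus the always-included ones are kept. The actual implementation may resolve these cases differently.

```r
# Hypothetical helper for illustration only; not part of the package.
resolve_feature_columns <- function(
    available_features,
    signature = NULL,
    novelty_features = NULL,
    exclude_features = NULL,
    include_features = NULL) {

  # Features in signature and novelty_features are always retained.
  always_included <- union(signature, novelty_features)

  # exclude_features may not overlap with signature, novelty_features or
  # include_features.
  protected <- union(always_included, include_features)
  overlap <- intersect(exclude_features, protected)
  if (length(overlap) > 0) {
    stop("exclude_features overlaps with: ", paste(overlap, collapse = ", "))
  }

  if (!is.null(include_features)) {
    # Assumption: include_features takes precedence, so only these features
    # and the always-included features are kept.
    keep <- union(include_features, always_included)
  } else {
    # Otherwise all features are kept, except those explicitly excluded.
    keep <- setdiff(available_features, exclude_features)
  }

  # Preserve the original column order.
  available_features[available_features %in% keep]
}

features <- c("age", "sex", "stage", "biomarker")
resolve_feature_columns(features, exclude_features = "sex")
resolve_feature_columns(features, signature = "age", include_features = "stage")
```

The first call drops only the excluded feature. The second keeps the signature feature together with the explicitly included one.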
other_id_column: Character string indicating another identifier column.
outcome_column: Character string indicating the outcome column(s).
col_type: Character string indicating the type of column, i.e. sample or batch.
check_stringency: Specifies the stringency of various checks. The following values are used:
strict: default value used for summon_familiar. Thoroughly checks input data. Used internally for checking development data.
external_warn: value used for extract_data and related methods. Less stringent checks, but will warn for possible issues. Used internally for checking data for evaluation and explanation.
external: value used for external methods such as predict. Less stringent checks, particularly for identifier and outcome columns, which may be completely absent. Used internally for predict.
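The checks described above can be sketched in a few lines of R. The function check_identifier_column below is a hypothetical stand-in written for illustration; it is not the internal .check_input_identifier_column. Its feature_columns argument, its return values, and the way each stringency level handles a missing column (error for strict, warning for external_warn, silent acceptance for external) are assumptions based on the descriptions in this page.

```r
# Hypothetical sketch; the internal function may behave differently.
check_identifier_column <- function(
    id_column,
    data,
    feature_columns = NULL,
    other_id_column = NULL,
    outcome_column = NULL,
    col_type = "sample",
    check_stringency = c("strict", "external_warn", "external")) {

  check_stringency <- match.arg(check_stringency)

  # There should be exactly one identifier column name.
  if (length(id_column) != 1) {
    stop("Exactly one ", col_type, " identifier column is expected.")
  }

  # The identifier column may not double as a feature, another identifier,
  # or an outcome column.
  other_columns <- c(feature_columns, other_id_column, outcome_column)
  if (id_column %in% other_columns) {
    stop("The ", col_type, " identifier column '", id_column,
         "' overlaps with another user-provided column.")
  }

  # The column should exist in the data. Assumption: absence is an error
  # under "strict", a warning under "external_warn", and accepted silently
  # under "external".
  if (!id_column %in% colnames(data)) {
    message_text <- paste0(
      "The ", col_type, " identifier column '", id_column,
      "' was not found in the data."
    )
    if (check_stringency == "strict") stop(message_text)
    if (check_stringency == "external_warn") warning(message_text)
    return(invisible(FALSE))
  }

  invisible(TRUE)
}

data <- data.frame(
  sample_id = 1:3,
  age = c(50, 60, 70),
  outcome = c(0, 1, 0)
)

# Passes: the column exists and does not overlap with other columns.
check_identifier_column(
  "sample_id", data,
  feature_columns = "age",
  outcome_column = "outcome"
)

# Tolerated under "external": the identifier column may be absent.
check_identifier_column("batch_id", data, col_type = "batch",
                        check_stringency = "external")
```

A plain data.frame is used here to keep the sketch self-contained; the data set produced by .load_data may be of a different class.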