init_multi_data
Creates the labeled and unlabeled datasets for the
categorical and ordinal cases.
init_multi_data(train_id, train, init_N, type)
train_id: A numeric vector giving the ids of all training samples. Each sample has a unique id from 1 to the total number of samples.
train: A numeric matrix containing the training dataset. The number of rows equals the number of training samples; the first column holds the labels and the remaining columns hold the explanatory variables. Note that the ids of the samples in train are the same as those in train_id.
init_N: A numeric value that determines the number of initial labeled samples. Note that it should be neither too large nor too small.
type: A character string that determines which type of data will be generated; it must match one of 'ord' or 'cat'.
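As a rough illustration of the argument shapes described above, the following sketch builds toy inputs in base R. All values are made up, and the commented-out call assumes the package providing init_multi_data is installed and loaded.

```r
# Toy inputs shaped as the argument descriptions require (all values are made up).
set.seed(1)
n <- 30                                   # number of training samples
labels <- sample(0:2, n, replace = TRUE)  # first column: class labels 0, 1, 2
X <- matrix(rnorm(n * 3), nrow = n)       # remaining columns: explanatory variables
train <- cbind(labels, X)                 # numeric matrix with the labels first
train_id <- seq_len(n)                    # ids 1..n, one per row of train
init_N <- 10                              # size of the initial labeled set

# The call itself needs the package that provides init_multi_data:
# res <- init_multi_data(train_id, train, init_N, type = "cat")
```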
A list containing the following components:
A list containing the datasets that will be used.
The initial labeled dataset. Its number of samples is specified by init_N.
The label values, ranging from 0 to K, where K denotes the number of categories.
The unique ids of the samples in the initial labeled dataset.
The unique ids of the samples in the unlabeled dataset.
All training samples, composed of the samples corresponding to labeled_ids and the samples corresponding to unlabeled_ids.
init_multi_data generates the initial labeled dataset and the unlabeled dataset; the most informative sample is later selected from the unlabeled dataset and added to the labeled dataset. The number of samples in the initial labeled dataset is specified by the init_N argument. The value of 'type' should be 'ord' or 'cat'. If it equals 'ord', each element of splitted is composed of samples from classes K and K+1. Otherwise, each element of splitted is composed of samples from classes 0 and K.
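The split described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: sketch_split is a hypothetical helper, the initial labeled samples are assumed to be drawn at random, and the class pairs are assumed to run over (k, k+1) for 'ord' and (0, k) for 'cat'.

```r
# Hypothetical helper, NOT the package's code: mimics the documented split.
sketch_split <- function(train_id, train, init_N, type = c("ord", "cat")) {
  type <- match.arg(type)
  # Assumption: the initial labeled set is a random draw of init_N ids.
  labeled_ids <- sample(train_id, init_N)
  unlabeled_ids <- setdiff(train_id, labeled_ids)
  # Ids run from 1 to n, so they can index the rows of train directly.
  labeled <- train[labeled_ids, , drop = FALSE]
  K <- max(train[, 1])                      # largest label; labels start at 0
  pairs <- if (type == "ord") {
    lapply(seq_len(K) - 1, function(k) c(k, k + 1))  # (0,1), (1,2), ...
  } else {
    lapply(seq_len(K), function(k) c(0, k))          # (0,1), (0,2), ...
  }
  # One labeled sub-dataset per class pair.
  splitted <- lapply(pairs, function(p) {
    labeled[labeled[, 1] %in% p, , drop = FALSE]
  })
  list(splitted = splitted,
       labeled_ids = labeled_ids,
       unlabeled_ids = unlabeled_ids)
}

# Toy data: 30 samples, three classes (0, 1, 2), three explanatory variables.
set.seed(42)
train <- cbind(rep(0:2, each = 10), matrix(rnorm(90), nrow = 30))
train_id <- seq_len(nrow(train))
res_ord <- sketch_split(train_id, train, init_N = 12, type = "ord")
res_cat <- sketch_split(train_id, train, init_N = 12, type = "cat")
```

In this sketch the labeled and unlabeled ids are disjoint and together cover train_id, which matches the description of the returned components.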
# NOT RUN {
## For an example, see example(seq_ord_model)
# }