MoTBFs (version 1.2)

subsetData: Subset a Dataset

Description

Collection of functions for subsetting a dataset by entries or by variables, and for splitting it into a training dataset and a test dataset.

Usage

TrainingandTestData(data, percentage_test, discreteVariables = NULL)

newData(data, nameX, nameY)

splitdata(data, nameVariable, min, max)

Arguments

data

A dataset of class "data.frame".

percentage_test

The proportion of the data to be used as the test dataset. A value between 0 and 1.

discreteVariables

A "character" array with the names of the discrete variables.

nameX

The name of the child variable in the conditional method.

nameY

The names of the parent variables in the conditional method.

nameVariable

The name of the variable to filter on.

min

The lower bound of the filtering interval.

max

The upper bound of the filtering interval.

Value

A list of datasets for TrainingandTestData(), or a subset of the original dataset for the other two functions.
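
To make the shape of the returned list concrete, here is a minimal base R sketch of a random training/test split. The helper name split_train_test and its seed argument are hypothetical, and this is not the package's implementation: in particular, it ignores discreteVariables and simply samples row indices.

```r
# Hypothetical helper (not part of MoTBFs): random train/test split in base R.
split_train_test <- function(data, percentage_test, seed = NULL) {
  stopifnot(is.data.frame(data), percentage_test >= 0, percentage_test <= 1)
  if (!is.null(seed)) set.seed(seed)

  all_idx <- seq_len(nrow(data))
  # Number of rows assigned to the test set, rounded to the nearest integer.
  n_test <- round(nrow(data) * percentage_test)
  test_idx <- sample(all_idx, size = n_test)

  # setdiff() avoids the empty-negative-index pitfall when n_test is 0.
  list(
    Training = data[setdiff(all_idx, test_idx), , drop = FALSE],
    Test     = data[test_idx, , drop = FALSE]
  )
}

df <- data.frame(X = rnorm(100), Y = rchisq(100, df = 8))
parts <- split_train_test(df, percentage_test = 0.2, seed = 1)
nrow(parts$Training)
nrow(parts$Test)
```

With 100 rows and percentage_test = 0.2, the sketch puts 20 rows in Test and the remaining 80 in Training; the two subsets share no rows.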

Examples

# NOT RUN {
library(MoTBFs)

## Dataset
X <- rnorm(1000)
Y <- rchisq(1000, df = 8)
Z <- rep(letters[1:10], times = 1000/10)
data <- data.frame(X = X, Y = Y, Z = Z)
data <- discreteVariables_as.character(dataset = data, discreteVariables = "Z")

## Training and Test Datasets
TT <- TrainingandTestData(data, percentage_test = 0.2)
TT$Training
TT$Test

## Subset Dataset
newData(data, nameX = "X", nameY = "Z")
splitdata(data, nameVariable = "X", min = 2, max = 3)
# }
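
For comparison, the kind of interval filter described for splitdata() can be approximated in base R. The helper filter_interval below is hypothetical, and it assumes closed bounds (min <= value <= max); whether splitdata() itself includes the endpoints is not stated on this page, so treat this as a sketch rather than a drop-in equivalent.

```r
# Hypothetical helper (not part of MoTBFs): keep rows whose value of
# `nameVariable` lies in the closed interval [min, max].
filter_interval <- function(data, nameVariable, min, max) {
  stopifnot(is.data.frame(data), nameVariable %in% names(data), min <= max)
  values <- data[[nameVariable]]
  # Rows with missing values are dropped rather than propagated as NA rows.
  keep <- !is.na(values) & values >= min & values <= max
  data[keep, , drop = FALSE]
}

df <- data.frame(X = c(-1, 0.5, 2, 2.5, 3, 4), Z = letters[1:6])
res <- filter_interval(df, nameVariable = "X", min = 2, max = 3)
res
```

Under the closed-interval assumption, the rows with X equal to 2, 2.5, and 3 are kept; all columns of the original data frame are retained.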
