smotefamily (version 1.3.1)

sample_generator: Generate a 2-dimensional dataset

Description

Generates a 2-dimensional dataset given the number of instances and the ratio of negative instances to total instances. The positive instances are distributed uniformly within a circle at the center of the domain, while the negative instances are spread over the domain around it. Random positive outcasts are also generated. The dataset is intended to show the differences between the datasets produced by each sampling technique.

Usage

sample_generator(n, ratio = 0.8, xlim = c(0, 1), ylim = c(0, 1),
   radius = 0.25, overlap = -0.05, outcast_ratio = 0.01)

Value

A 2-dimensional dataset whose 3rd column is the target class vector ("p" for positive and "n" for negative instances, as used in the example).

Arguments

n

The number of instances in the dataset

ratio

The ratio of negative instances to the total number of instances

xlim

The range of values in the first dimension

ylim

The range of values in the second dimension

radius

The radius of the circle of positive instances

overlap

The gap between the sets of positive and negative instances

outcast_ratio

The ratio of outcasts to be generated in this dataset.
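
The layout described by these arguments can be illustrated with a minimal base R sketch. This is not the smotefamily source: sketch_generator is a hypothetical helper, it assumes that overlap shifts the inner boundary of the negative region to radius + overlap from the center, and it leaves out the outcasts entirely.

```r
# Hypothetical re-creation of the described layout; NOT the smotefamily code.
sketch_generator <- function(n, ratio = 0.8, xlim = c(0, 1), ylim = c(0, 1),
                             radius = 0.25, overlap = -0.05) {
  n_neg <- round(n * ratio)
  n_pos <- n - n_neg
  centre <- c(mean(xlim), mean(ylim))

  # Positives: uniform over a disc. Taking sqrt() of a uniform draw for the
  # radial distance keeps the point density uniform over the disc's area.
  r <- radius * sqrt(runif(n_pos))
  theta <- runif(n_pos, 0, 2 * pi)
  pos <- cbind(centre[1] + r * cos(theta), centre[2] + r * sin(theta))

  # Negatives: rejection sampling over the domain, keeping points at least
  # radius + overlap away from the centre (assumed meaning of `overlap`).
  neg <- matrix(numeric(0), ncol = 2)
  while (nrow(neg) < n_neg) {
    cand <- cbind(runif(n_neg, xlim[1], xlim[2]),
                  runif(n_neg, ylim[1], ylim[2]))
    d <- sqrt((cand[, 1] - centre[1])^2 + (cand[, 2] - centre[2])^2)
    neg <- rbind(neg, cand[d >= radius + overlap, , drop = FALSE])
  }
  neg <- neg[seq_len(n_neg), , drop = FALSE]

  data.frame(x = c(pos[, 1], neg[, 1]), y = c(pos[, 2], neg[, 2]),
             class = c(rep("p", n_pos), rep("n", n_neg)))
}

set.seed(1)
sketch <- sketch_generator(1000)
table(sketch$class)
```

Under this reading, a negative overlap such as the default lets the negative instances reach inside the circle of positives, so the two classes share a thin ring near the boundary.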

Author

Wacharasak Siriseriwan <wacharasak.s@gmail.com>

Examples

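	library(smotefamily)
	# Generate 5000 instances, 80% of them negative
	data_example = sample_generator(5000, ratio = 0.80)
	# Plot the negative instances in yellow
	plot(data_example[data_example[, 3] == "n", 1],
	     data_example[data_example[, 3] == "n", 2], col = "yellow")
	# Overlay the positive instances in red
	points(data_example[data_example[, 3] == "p", 1],
	       data_example[data_example[, 3] == "p", 2], col = "red", pch = 14)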
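
As a further check, the class balance of the generated data can be tabulated from the third column. The sketch below assumes the smotefamily package is installed and that the class labels are "n" and "p", as in the plotting code above. The seed value is arbitrary and only makes the random draw repeatable.

```r
library(smotefamily)

set.seed(42)  # the generator is random, so fix the seed for repeatability
data_example <- sample_generator(5000, ratio = 0.80)

# Count the instances per class in the third column, then show proportions
class_counts <- table(data_example[, 3])
print(class_counts)
print(prop.table(class_counts))
```

The proportion of "n" should be close to the requested ratio, though it may differ slightly because of rounding and the generated outcasts.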
