Note: a newer version (3.0.2) of this package is available.

dprep (version 2.1)

Data preprocessing and visualization functions for classification

Description

Functions for normalization, handling of missing values, discretization, outlier detection, feature selection, and visualization

Install

install.packages('dprep')

Monthly Downloads

23

Version

2.1

License

GPL

Maintainer

Edgar Acuna

Last Published

August 21st, 2009

Functions in dprep (2.1)

lofactor

Local Outlier Factor
censusn

The census dataset
robout

Outlier Detection with Robust Mahalanobis distance
ce.impute

Imputation in supervised classification
midpoints

Auxiliary function for computing minimum entropy discretization
cv10mlp

10-fold cross validation error estimation for the multilayer perceptron classifier
colon

Alon et al.'s colon dataset
near2

Auxiliary function for the reliefcat function
chiMerge

Discretization using the Chi-Merge method
breastw

The Breast Wisconsin dataset
ec.knnimp

KNN Imputation
eje1dis

Basic example for discriminant analysis
baysout

Outlier detection using Bay and Schwabacher's algorithm.
cv10lda2

Auxiliary function for sequential forward selection
assig

Auxiliary function for computing the minimum entropy discretization
ce.knn.imp

Function that calls ec.knnimp to perform knn imputation
parallelplot

Parallel Coordinate Plot
disc.ef

Discretization using the method of equal frequencies
my.iris

The Iris dataset
imagmiss

Visualization of Missing Data
cv10rpart2

Auxiliary function for sequential feature selection
bupa

The Bupa dataset
rangenorm

Range normalization
diabetes

The Pima Indian Diabetes dataset
crossval

Cross validation estimation of the misclassification error
closest

Auxiliary function used in the function baysout
knneigh.vect

Auxiliary function for computing the LOF measure.
heartc

The Heart Cleveland dataset
cv10knn2

Auxiliary function for sequential feature selection
cv10log

10-fold cross validation error estimation for the classifier based on logistic regression
relief

RELIEF Feature Selection
disc.1r

Discretization using Holte's 1R method
maxdist

Auxiliary function used when executing Bay's algorithm for outlier detection
decscale

Decimal Scaling
maxlof

Detection of multivariate outliers using the LOF algorithm
combinations

Constructing distinct permutations
clean

Dataset Cleaning
disc2

Auxiliary function for performing discretization using equal frequency
ce.mimp

Mean or median imputation
ionosphere

The Ionosphere dataset
outbox

Detecting outliers through boxplots of the features.
disc.ew

Discretization using the equal width method
circledraw

circledraw
finco

FINCO Feature Selection Algorithm
mardia

Mardia's test of normality
dist.to.knn

Auxiliary function for the LOF algorithm.
hawkins

The Hawkins-Bradu-Kass dataset
pp.golub

Golub's preprocessed dataset
disc.mentr

Discretization using the minimum entropy criterion
distan2

Auxiliary function used by the RELIEF function in the dprep library.
reliefcont

Feature selection by the Relief Algorithm for datasets with only continuous features
srbct

Khan et al.'s small round blood cells dataset
inconsist

Computing the inconsistency measure
softmaxnorm

Softmax Normalization
hepatitis

The hepatitis dataset
dprep-package

Data Preprocessing for Supervised Classification
reachability

Function for computing the reachability measure in the LOF algorithm
vvalen

The Van Valen test for equal covariance matrices
discretevar

Performs Minimum Entropy discretization for a given attribute
vvalen1

Auxiliary function for computing Van Valen's homoscedasticity test
distancia

Vector-Vector Euclidean Distance Function
mo4

The fourth moment of a multivariate distribution
sffs

Sequential Floating Forward Method
nnmiss

Auxiliary function for knn imputation
vehicle

The Vehicle dataset
score

Score function used in Bay's algorithm for outlier detection
moda

Calculating the Mode
near1

Auxiliary function for the reliefcont function
mo3

The third moment of a multivariate distribution
radviz2d

Radial Coordinate Visualization
mmnorm

Min-max normalization
reliefcat

Feature selection by the Relief Algorithm for datasets with only nominal features
znorm

Z-score normalization
top

Auxiliary function for Bay's Outlier Detection Algorithm
sfs

Sequential Forward Selection
redundancy

Finding the unique observations in a dataset along with their frequencies
sbs1

One-step sequential backward selection
surveyplot

Surveyplot
mahaout

Multivariate outlier detection through the boxplot of the Mahalanobis distance
lvf

Las Vegas Filter
tchisq

Auxiliary function for the Chi-Merge discretization
sonar

The Sonar dataset
sfs1

One-step sequential forward selection
row.matches

Finding rows in a matrix equal to a given vector
signorm

Sigmoidal Normalization
starcoord

The star coordinates plot