mt (version 2.0-1.20)

mv.util: Missing Value Utilities

Description

Functions for handling missing values in a data set.

Usage

mv.stats(dat, grp = NULL, ...)

mv.fill(dat, method = "mean", ze_ne = FALSE)

mv.zene(dat)

Value

mv.fill returns an imputed data frame.

mv.zene returns a data frame in which zeros and negative values are replaced by NA.

mv.stats returns a list including the components:

  • mv.overall: Overall missing value rate.

  • mv.var: Missing value rate per variable (column).

  • mv.grp: A matrix of missing value rates for the different groups, if argument grp is given.

  • mv.grp.plot: An object of class trellis for plotting mv.grp, if argument grp is given.
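
The rate components above can be approximated in base R. The following is a minimal sketch, not the package's implementation: the helper name na_rates and the toy data are hypothetical, and the layout of the per-group matrix (groups in rows, variables in columns) is an assumption. The trellis plot component is omitted.

```r
# Hypothetical helper: missing-value rates overall, per column, and per group.
na_rates <- function(dat, grp = NULL) {
  miss <- is.na(as.matrix(dat))      # logical matrix, TRUE where a value is missing
  res <- list(
    overall = mean(miss),            # fraction of all cells that are NA
    var     = colMeans(miss)         # fraction of NA per column
  )
  if (!is.null(grp)) {
    # For each column, average the missing indicator within each group.
    # Assumed layout: one row per group level, one column per variable.
    res$grp <- apply(miss, 2, function(col) tapply(col, grp, mean))
  }
  res
}

# Toy data (hypothetical): two variables, two groups of two samples each.
toy <- data.frame(a = c(1, NA, 3, NA), b = c(NA, 2, 3, 4))
grp <- factor(c("x", "x", "y", "y"))
rates <- na_rates(toy, grp)
rates$overall   # 3 of 8 cells are missing
rates$var
rates$grp
```

Comparing the per-group rates can show whether missingness is concentrated in one class, which is worth knowing before choosing an imputation method.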

Arguments

dat

A data frame or matrix containing the data set.

grp

A factor or vector of class labels.

method

Univariate imputation method for missing values, given as the name of a function such as "mean" or "median", or of a user-defined function. For details, see the examples below.

ze_ne

A logical value indicating whether the zeros or negatives should be treated as missing values.

...

Additional parameters to mv.stats for plotting using lattice.
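
As a rough illustration of how method and ze_ne interact, the sketch below imputes each column from its own observed entries. It is a base-R approximation under stated assumptions, not the code of mv.fill: the helper name fill_na is hypothetical, and it assumes the method is called once per column with na.rm = TRUE and that its result replaces every NA in that column.

```r
# Hypothetical helper: column-wise (univariate) imputation.
fill_na <- function(dat, method = mean, ze_ne = FALSE) {
  dat <- as.data.frame(dat)
  f <- match.fun(method)                     # accept a function or its name
  dat[] <- lapply(dat, function(x) {
    if (ze_ne) x[!is.na(x) & x <= 0] <- NA   # treat zeros and negatives as missing
    x[is.na(x)] <- f(x, na.rm = TRUE)        # fill NAs with the computed value(s)
    x
  })
  dat
}

# Toy data (hypothetical).
toy <- data.frame(a = c(1, NA, 3), b = c(0, 4, 6))
filled_mean <- fill_na(toy, "mean")                  # NA in a is filled with mean(1, 3)
filled_zene <- fill_na(toy, "median", ze_ne = TRUE)  # 0 in b becomes NA, then median(4, 6)
```

Because the function is looked up by name, any function that accepts na.rm (directly or through ...) could be plugged in under this assumption.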

Author

Wanchang Lin

Examples

data(abr1)
dat <- abr1$pos[,1970:1980]
cls <- factor(abr1$fact$class)

## fill zeros with NAs
dat <- mv.zene(dat)

## missing values summary
mv <- mv.stats(dat, grp=cls) 
plot(mv$mv.grp.plot)

## fill NAs with mean
dat.mean <- mv.fill(dat,method="mean")

## fill NAs with median
dat.median <- mv.fill(dat,method="median")

## -----------------------------------------------------------------------
## fill NAs with user-defined methods: two examples given here.
## a.) Random imputation function:
rand <- function(x,...) sample(x[!is.na(x)], sum(is.na(x)), replace=TRUE)

## test this function:
(tmp <- dat[,1])        ## a vector with NAs
## get the randomised values for NAs
rand(tmp)

## fill NAs with method "rand"
dat.rand <- mv.fill(dat,method="rand")

## b.) "Low" imputation function:
"low" <- function(x, ...) {
  max(mean(x,...) - 3 * sd(x,...), min(x, ...)/2)
}
## fill NAs with method "low"
dat.low <- mv.fill(dat, method="low") 

## summary of imputed data set
df.summ(dat.mean)