fdt: Frequency Distribution Table

Description

A set of S3 methods to easily build a frequency distribution table (fdt) from vector, data.frame and matrix objects.

Usage

fdt(x, ...)

  ## S3 method for class 'default':
fdt(x, k, start, end, h, breaks=c("Sturges", "Scott", "FD"),
    right=FALSE, ...)
  ## S3 method for class 'data.frame':
fdt(x, k, by, breaks=c("Sturges", "Scott", "FD"),
    right=FALSE, ...)
  ## S3 method for class 'matrix':
fdt(x, k, breaks=c("Sturges", "Scott", "FD"),
    right=FALSE, ...)

Arguments

x

A numeric vector, data.frame or matrix object. If x is a data.frame or matrix, it must contain at least one numeric column.

k

Number of class intervals.

start

Left endpoint of the first class interval.

end

Right endpoint of the last class interval.

h

Class interval width.

by

Categorical variable used for grouping each numeric variable; useful only for data.frames.

breaks

Method used to determine the number of class intervals: one of "Sturges", "Scott" or "FD" (Freedman-Diaconis).

right

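Logical. If FALSE (the default), the right endpoints are open, so classes have the form [a, b). If TRUE, the right endpoints are closed, so classes have the form (a, b].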

...

Potential further arguments (required by the generic).
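The effect of right on class boundaries can be illustrated with base R's cut(). This is only a sketch of the interval convention and assumes nothing about how fdt is implemented internally. The names left_closed and right_closed are illustrative.

```r
# Illustration of the interval convention only; this does not call fdth.
x <- rep(1:3, 3)

# right = FALSE: classes closed on the left, open on the right
left_closed <- table(cut(x, breaks = 1:4, right = FALSE))

# right = TRUE: classes open on the left, closed on the right
right_closed <- table(cut(x, breaks = 0:3, right = TRUE))

names(left_closed)   # "[1,2)" "[2,3)" "[3,4)"
names(right_closed)  # "(0,1]" "(1,2]" "(2,3]"
```

In both cases each class receives three observations. Only the side on which a boundary value is counted changes.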

Value

The method fdt.default returns a list of class fdt.default with the slots:

table

A data.frame storing the fdt.

breaks

A vector of length 4 storing start, end, h and right of the fdt generated by this method.

data

A vector of the data x provided.

The methods fdt.data.frame and fdt.matrix return a list of class fdt.multiple. This list has one slot for each numeric variable of the x provided. Each of these slots stores the same slots as the fdt.default object described above.

Details

The simplest way to run fdt is to supply only the x object, for example: d <- fdt(x). In this case the default values breaks="Sturges" and right=FALSE are used. Alternatively, you can supply: a) x and k; b) x, start and end; or c) x, start, end and h. These options make fdt easy and flexible to use. The fdt object stores the information used by the summary, print and plot methods. The result of plot is a histogram. These methods provide a reasonable set of parameters to format and plot the fdt object in a pretty (and publishable) way.
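To make the roles of k, start, end, h and right concrete, here is a minimal base R sketch that builds a comparable table by hand using Sturges' rule. It is not the fdth implementation. The padding of the range (pad) and the column names (class, f, rf, cf) are assumptions chosen for illustration, so the limits will not necessarily match what fdt(x) prints.

```r
# Hand-rolled frequency table in base R; a sketch, not fdth's own code.
set.seed(1)
x <- rnorm(n = 1e3, mean = 5, sd = 1)

# Number of classes by Sturges' rule: ceiling(log2(n) + 1)
k <- grDevices::nclass.Sturges(x)

# Assumption: widen the range slightly so every value falls inside a class
pad   <- diff(range(x)) / 100
start <- min(x) - pad
end   <- max(x) + pad
h     <- (end - start) / k                    # class interval width
brk   <- seq(start, end, length.out = k + 1)  # k + 1 class limits

# right = FALSE gives classes of the form [a, b)
cls <- cut(x, breaks = brk, right = FALSE)
f   <- as.vector(table(cls))

tab <- data.frame(class = levels(cls),
                  f     = f,              # absolute frequency
                  rf    = f / sum(f),     # relative frequency
                  cf    = cumsum(f))      # cumulative frequency
tab
```

Supplying k, or start and end, or all of start, end and h to fdt corresponds to fixing the matching quantities above instead of deriving them from the data.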

Examples

library(fdth)

#======================
# Vectors: univariate
#======================
set.seed(1); x <- rnorm(n=1e3, mean=5, sd=1)

# x
d <- fdt(x); d

# x, alternative breaks
d <- fdt(x, breaks='Scott'); d

# x, k
d <- fdt(x, k=20); d

# x, start, end
range(x)
d <- fdt(x, start=1.5, end=9); d

# x, start, end, h
d <- fdt(x, start=1, end=9, h=1); d

# Effect of right
x <- rep(1:3, 3); sort(x)
d <- fdt(x, start=1, end=4, h=1); d

d <- fdt(x, start=0, end=3, h=1, right=TRUE); d

#=============================================
# Data.frames: multivariate with categorical
#=============================================
mdf <- data.frame(X1 = as.factor(rep(LETTERS[1:4], 25)),
                  X2 = as.factor(rep(1:10, 10)),
                  Y1 = c(NA, NA, rnorm(96, 10, 1), NA, NA),
                  Y2 = rnorm(100, 60, 4),
                  Y3 = rnorm(100, 50, 4),
                  Y4 = rnorm(100, 40, 4))

d <- fdt(mdf); d

levels(mdf$X1)
d <- fdt(mdf, k=5, by='X1'); d

d <- fdt(mdf, breaks='FD', by='X1')
str(d)
d

levels(mdf$X2)
d <- fdt(mdf, breaks='FD', by='X2'); d

d <- fdt(mdf, k=5, by='X2'); d

d <- fdt(iris, k=5); d

d <- fdt(iris, k=10); d

levels(iris$Species)
d <- fdt(iris, k=5, by='Species'); d

#=========================
# Matrices: multivariate
#=========================
d <- fdt(state.x77); d

d <- fdt(volcano); d
