kmed (version 0.4.2)

distNumeric: A pair distance for numerical variables

Description

This function computes a pairwise numerical distance between two numerical data sets.

Usage

distNumeric(x, y, method = "mrw", xyequal = TRUE)

Value

The function returns a distance matrix whose number of rows equals the number of objects in the x matrix (\(n_x\)) and whose number of columns equals the number of objects in the y matrix (\(n_y\)).

Arguments

x

A first data matrix (see Details).

y

A second data matrix (see Details).

method

A method to calculate the pairwise numerical distance (see Details).

xyequal

A logical value indicating whether x is equal to y (see Details).

Author

Weksi Budiaji
Contact: budiaji@untirta.ac.id

Details

The x and y arguments must be matrices with the same number of columns, where each row is an object and each column is a variable. The function calculates all pairwise distances between the rows of the x matrix and the rows of the y matrix. Although it is designed to compute pairwise distances between two data sets, by default (xyequal = TRUE) it computes all distances within the x matrix. If the x matrix is not equal to the y matrix, xyequal has to be set to FALSE.
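To illustrate what "all pairwise distances between rows" of two different matrices means, here is a minimal base R sketch. It does not call distNumeric; it uses a plain unweighted squared Euclidean distance, and the objects x, y, and se_xy are names chosen only for this illustration.

```r
# Two matrices with the same columns (variables) but different rows (objects)
x <- as.matrix(iris[1:5, 1:4])     # n_x = 5 objects
y <- as.matrix(iris[51:53, 1:4])   # n_y = 3 objects

# Entry [i, j] is the squared Euclidean distance between row i of x and row j of y
se_xy <- outer(
  seq_len(nrow(x)), seq_len(nrow(y)),
  Vectorize(function(i, j) sum((x[i, ] - y[j, ])^2))
)

dim(se_xy)  # n_x rows, n_y columns
```

The result is rectangular rather than square, which matches the shape described under Value for the case where x and y differ.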

The available methods are mrw (Manhattan weighted by range), sev (squared Euclidean weighted by variance), ser (squared Euclidean weighted by range), ser.2 (squared Euclidean weighted by squared range) and se (squared Euclidean). Their formulas are:

$$mrw_{ij} = \sum_{r=1}^{p_n} \frac{|x_{ir} - x_{jr}|}{R_r}$$

$$sev_{ij} = \sum_{r=1}^{p_n} \frac{(x_{ir} - x_{jr})^2}{s_r^2}$$

$$ser_{ij} = \sum_{r=1}^{p_n} \frac{(x_{ir} - x_{jr})^2}{R_r}$$

$$ser.2_{ij} = \sum_{r=1}^{p_n} \frac{(x_{ir} - x_{jr})^2}{R_r^2}$$

$$se_{ij} = \sum_{r=1}^{p_n} (x_{ir} - x_{jr})^2$$

where \(p_n\) is the number of numerical variables, \(R_r\) is the range of the r-th variable, and \(s_r^2\) is the variance of the r-th variable.
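As a rough check on the mrw formula, the following base R sketch computes a range-weighted Manhattan distance by hand for the case where x and y are the same matrix. It is written from the formula only and is not the package's implementation; it assumes \(R_r\) is the maximum minus the minimum of each column, and the names rng, scaled, and mrw_manual are illustrative.

```r
num <- as.matrix(iris[, 1:4])

# R_r: range (max - min) of each variable; assumed definition of "range"
rng <- apply(num, 2, function(v) max(v) - min(v))

# Dividing each column by its range and then taking the Manhattan distance
# gives sum_r |x_ir - x_jr| / R_r
scaled <- sweep(num, 2, rng, "/")
mrw_manual <- as.matrix(dist(scaled, method = "manhattan"))

mrw_manual[1:3, 1:3]
```

Scaling the columns first keeps the sketch short, because the weighted sum of absolute differences is the same as an ordinary Manhattan distance on the rescaled data.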

Examples

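library(kmed)
# Use the four numerical columns of iris as the data matrix
num <- as.matrix(iris[,1:4])
# x and y are the same matrix, so the default xyequal = TRUE applies
mrwdist <- distNumeric(num, num, method = "mrw")
# Show the distances among the first six objects
mrwdist[1:6,1:6]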