manydist (version 0.4.5)

Unbiased Distances for Mixed-Type Data

Description

A comprehensive framework for calculating unbiased distances in datasets containing mixed-type variables (numerical and categorical). The package implements a general formulation that ensures multivariate additivity and commensurability, meaning that variables contribute equally to the overall distance regardless of their type, scale, or distribution. Supports multiple distance measures including Gower's distance, Euclidean distance, Manhattan distance, and various categorical variable distances such as simple matching, Eskin, occurrence frequency, and association-based distances. Provides tools for variable scaling (standard deviation, range, robust range, and principal component scaling), and handles both independent and association-based category dissimilarities. Implements methods to correct for biases that typically arise from different variable types, distributions, and number of categories. Particularly useful for cluster analysis, data visualization, and other distance-based methods when working with mixed data. Methods based on van de Velden et al. (2024) "Unbiased mixed variables distance".
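
The two properties named above, additivity and commensurability, can be illustrated in base R without the package. This is a minimal sketch, not the manydist implementation: the toy data, the helper names (num_dist, cat_dist, commensurable), and the choice to rescale each variable so that its mean pairwise distance equals 1 are all assumptions made here for illustration.

```r
# Toy mixed-type data (made-up values, for illustration only)
df <- data.frame(
  height = c(170, 182, 165, 190),
  weight = c(65, 90, 58, 84),
  foot   = factor(c("left", "right", "right", "left"))
)

# Hypothetical helper: absolute (Manhattan) difference on one numeric variable
num_dist <- function(v) abs(outer(v, v, "-"))

# Hypothetical helper: simple matching on one categorical variable
# (0 if the two categories agree, 1 otherwise)
cat_dist <- function(v) {
  v <- as.character(v)
  1 * outer(v, v, "!=")
}

# Hypothetical helper: rescale a distance matrix so that its mean
# off-diagonal entry is 1 (one simple reading of commensurability)
commensurable <- function(d) d / mean(d[upper.tri(d)])

# One rescaled distance matrix per variable
per_var <- lapply(df, function(v) {
  d <- if (is.numeric(v)) num_dist(v) else cat_dist(v)
  commensurable(d)
})

# Additivity: the overall distance is the sum of the per-variable distances
total <- Reduce(`+`, per_var)
```

After the rescaling, each of the three variables has the same average contribution to total, whatever its type or unit. The package itself offers further scaling options and categorical measures listed in the description; see its reference manual for the actual arguments.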

Install

install.packages('manydist')

Monthly Downloads

254

Version

0.4.5

License

GPL-3

Maintainer

Angelos Markos

Last Published

July 2nd, 2025

Functions in manydist (0.4.5)

mdist

Calculation of Pairwise Distances for Mixed-Type Data

cdist

Calculation of Pairwise Distances for Categorical Data

ndist

Calculation of Pairwise Distances for Continuous Data

fifa_nl

FIFA 21 Player Data - Dutch League