dbstats-package: Distance-based statistics (dbstats)
Description
This package contains functions for distance-based prediction methods.
These are methods for prediction where predictor information is coded
as a matrix of distances between individuals.
In the currently implemented methods the response is a univariate variable
as in the ordinary linear model or in the generalized linear model.
Distances can either be directly input as an interdistances matrix,
a squared interdistances matrix, an inner-products matrix
(see GtoD2) or computed from observed explanatory variables.
Notation convention: in distance-based methods we must distinguish
observed explanatory variables, which we denote by Z or z, from
Euclidean coordinates, which we denote by X or x. For an explanation
of the meaning of both terms, see the references below.
Observed explanatory variables z are possibly a mixture of continuous and
qualitative explanatory variables or more general quantities.
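The distinction between z and x can be illustrated with a small base R sketch. It assumes simulated numeric data and uses cmdscale from the stats package to obtain Euclidean coordinates from distances; it does not call any dbstats function, and all object names are hypothetical.

```r
set.seed(1)
n <- 20
z <- cbind(z1 = rnorm(n), z2 = rnorm(n))   # observed explanatory variables
y <- 1 + 2 * z[, "z1"] - z[, "z2"] + rnorm(n, sd = 0.1)   # simulated response

d <- dist(z)              # Euclidean distances between individuals
x <- cmdscale(d, k = 2)   # Euclidean coordinates recovered from the distances

fit_z <- lm(y ~ z)        # ordinary linear model on the observed variables
fit_x <- lm(y ~ x)        # linear model on the distance-derived coordinates
```

With the Euclidean distance, x is a rotation of the centered z, so both fits should give the same fitted values. With another distance, for instance one suited to mixed data, the coordinates x no longer correspond to the observed z, which is the case the distance-based methods are designed for.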
dbstats does not provide specific functions for computing distances,
depending instead on other functions and packages, such as:
dist in the stats package.
dist in the proxy package. When the proxy package is loaded, its dist
function supersedes the one in the stats package.
daisy in the cluster package.
Compared to both instances of dist above, whose input must be
numeric variables, the main feature of daisy is
its ability to handle other variable types as well (e.g. nominal, ordinal,
(a)symmetric binary) even when different types occur in the same data set.
Strictly speaking, the restriction to numeric input applies only to
the default behaviour of both dist functions: the dist
function in the proxy package can
evaluate distances between observations with a user-provided function,
entered as a parameter, hence it can deal with any type of data. See the
examples in pr_DB.
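A minimal sketch of the difference between dist and daisy follows. The data frame is hypothetical, and the call to daisy is guarded because it assumes the cluster package is installed.

```r
# Hypothetical mixed-type data: numeric, nominal and ordinal variables
z <- data.frame(
  age    = c(23, 35, 41, 58),
  region = factor(c("north", "south", "south", "east")),
  grade  = factor(c("low", "mid", "high", "mid"),
                  levels = c("low", "mid", "high"), ordered = TRUE)
)

# stats::dist expects numeric input, so only the numeric column is used
d_num <- dist(z["age"])

# cluster::daisy accepts all three columns; Gower's coefficient is one
# metric it offers for mixed variable types
if (requireNamespace("cluster", quietly = TRUE)) {
  d_mixed <- cluster::daisy(z, metric = "gower")
}
```

Either object holds pairwise dissimilarities between the four individuals and can be converted with as.matrix when a full matrix is needed.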
Functions of the dbstats package:
Linear and local linear models with a continuous response:
dblm for distance-based linear models.
ldblm for local distance-based linear models.
dbplsr for distance-based partial least squares.
Generalized linear and local generalized linear models with a numeric response:
dbglm for distance-based generalized linear models.
ldbglm for local distance-based generalized linear models.
Details
Package: dbstats
Type: Package
Version: 1.0.1
Date: 2011-06-21
License: GPL-2
LazyLoad: yes
References
Boj E, Delicado P, Fortiana J (2010). Distance-based local linear regression for functional predictors.
Computational Statistics and Data Analysis 54, 429-437.
Boj E, Grane A, Fortiana J, Claramunt MM (2007). Implementing PLS for distance-based regression:
computational issues.
Computational Statistics 22, 237-248.
Boj E, Grane A, Fortiana J, Claramunt MM (2007). Selection of predictors in distance-based regression.
Communications in Statistics B - Simulation and Computation 36, 87-98.
Cuadras CM, Arenas C, Fortiana J (1996). Some computational aspects of a distance-based model
for prediction. Communications in Statistics B - Simulation and Computation 25, 593-609.
Cuadras C, Arenas C (1990). A distance-based regression model for prediction with mixed data.
Communications in Statistics A - Theory and Methods 19, 2261-2279.
Cuadras CM (1989). Distance analysis in discrimination and classification using both
continuous and categorical variables. In: Y. Dodge (ed.), Statistical Data Analysis and Inference.
Amsterdam, The Netherlands: North-Holland Publishing Co., pp. 459-473.