ldblm is a localized version of a distance-based linear model.
As in the global model dblm, explanatory information is coded as
distances between individuals.
Neighborhood definition for localizing is done by the (semi)metric
dist1 whereas a second (semi)metric dist2 (which may coincide
with dist1) is used for distance-based prediction.
Both dist1 and dist2 can either be computed from observed
explanatory variables or directly input as a squared interdistances
matrix or as a Gram matrix. The response is a continuous variable
as in the ordinary linear model. The model allows for a mixture of
continuous and qualitative explanatory variables or, in fact, for more
general quantities such as functional data.
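
The two matrix forms mentioned above can be sketched in base R. This is only an illustration with simulated data and uniform weights: it builds a squared interdistances matrix and the corresponding doubly centered inner product (Gram) matrix, and it does not show how the package itself converts or classes these objects. The names Z, D2, J, G and D2_back are illustrative.

```r
# Illustrative data: 5 individuals, 2 continuous explanatory variables.
set.seed(1)
Z <- matrix(rnorm(10), nrow = 5)

# Squared interdistances matrix from Euclidean distances.
D2 <- as.matrix(dist(Z))^2

# Doubly centered inner product (Gram) matrix: G = -1/2 * J D2 J,
# where J = I - (1/n) 1 1' is the centering matrix (uniform weights assumed).
n <- nrow(D2)
J <- diag(n) - matrix(1 / n, n, n)
G <- -0.5 * J %*% D2 %*% J

# The squared distances can be recovered from G:
# d2(i,j) = G[i,i] + G[j,j] - 2 * G[i,j]
g <- diag(G)
D2_back <- outer(g, g, "+") - 2 * G
```

Since D2 and G determine each other in this way, either one carries the same distance information.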
Notation convention: in distance-based methods we must distinguish
observed explanatory variables which we denote by Z or z, from
Euclidean coordinates which we denote by X or x. For an explanation
of the meaning of both terms, see the bibliography references below.

## S3 method for class 'formula':
ldblm(formula,data,...,kind.of.kernel=1,
metric1="euclidean",metric2=metric1,method="GCV",weights,
user_h=NULL,h.range=NULL,noh=10,k.knn=3,rel.gvar=0.95,eff.rank=NULL)
# method for distance class 'dist' or 'dissimilarity'
ldblm.dist(y,dist1,dist2=dist1,kind.of.kernel=1,
method="GCV",weights,user_h=quantile(dist1,.25)^.5,
h.range=quantile(as.matrix(dist1),c(.05,0.5))^.5,noh=10,
k.knn=3,rel.gvar=0.95,eff.rank=NULL,...)
# method for distance class 'D2'
ldblm.D2(y,D2_1,D2_2=D2_1,kind.of.kernel=1,method="GCV",
weights,user_h=NULL,h.range=NULL,noh=10,k.knn=3,rel.gvar=0.95,
eff.rank=NULL,...)
# method for class 'Gram'
ldblm.Gram(y,G1,G2=G1,kind.of.kernel=1,method="GCV",
weights,user_h=NULL,h.range=NULL,noh=10,k.knn=3,rel.gvar=0.95,
eff.rank=NULL,...)

dist1: dist or dissimilarity class object.
Distances between observations, used for neighborhood localizing
definition. Weights for observations are computed as a decreasing
function of their dist1 distances.

dist2: dist or dissimilarity class object.
Distances between observations, used for fitting dblm.
Default dist2=dist1.

D2_1: D2 class object. Squared distances matrix between individuals.
One of the alternative ways of entering distance information
to a function. See the Details section in dblm.
See dist1 above.

D2_2: D2 class object. Squared distances between observations.
One of the alternative ways of entering distance information
to a function. See the Details section in dblm.
See dist2 above.

G1: Gram class object. Doubly centered inner product matrix
associated with the squared distances matrix D2_1.

G2: Gram class object. Doubly centered inner product matrix
associated with the squared distances matrix D2_2.
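Default G2=G1.

metric1: metric used to compute dist1 from observed
explanatory variables.
One of "euclidean" (default), "manhattan",
or "gower".

metric2: metric used to compute dist2 from observed
explanatory variables.
One of "euclidean", "manhattan",
or "gower". Default metric2=metric1.

method: criterion used to choose the bandwidth h.
One of AIC, BIC, OCV,
GCV (default) and user_h.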
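OCV and GCV are the ordinary and generalized cross-validation criteria.

user_h: bandwidth, set by the user, controlling the size
of the local neighborhood of Z.
Smoothing parameter (Default: 1st quartile of all the distances
d(i,j) in dist1). Applies only if method="user_h".

h.range: range of bandwidth values for automatic bandwidth choice
(Default: quantiles 0.05 and 0.5 of the distances d(i,j) in dist1).

noh: number of bandwidth h values within h.range for
automatic bandwidth choice (if method!="user_h").

k.knn: used together with h.range and noh to define the set of
bandwidths checked (see the details below). Default k.knn=3.

rel.gvar: relative geometric variability.
In each dblm iteration, take the lowest effective rank with
a relative geometric variability higher than or equal to rel.gvar.
Default value: rel.gvar=0.95.

eff.rank: effective rank used in each dblm iteration.
If specified, its value overrides
rel.gvar. When eff.rank=NULL (default), rel.gvar is used.

An object of class ldblm containing, among others, the following components:

the bandwidth h used in the fit, chosen automatically (if method!="user_h").

dist1: the distances (object of class "D2" or "dist") used to calculate the weights of the observations.

dist2: the distances (object of class "D2" or "dist") used to fit the dblm.

Two semi-metrics are involved: dist1 and dist2. Both semi-metrics can coincide.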
For instance, when dist1=||xi-xj|| and
dist2=||(xi,xi^2,xi^3)-(xj,xj^2,xj^3)|| the estimator
for new observations coincides with fitting a local cubic polynomial
regression.
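
The local cubic case described above can be sketched in base R by building the two distance objects by hand. This only constructs the distances and does not call ldblm; the variable names x and X3 are illustrative, and the data are simulated.

```r
set.seed(1)
x <- rnorm(20)

# dist1: plain Euclidean distance between the one-dimensional x values,
# used to define the local neighborhoods.
dist1 <- dist(x)

# dist2: Euclidean distance between the cubic feature vectors (x, x^2, x^3),
# used for the distance-based prediction step.
X3 <- cbind(x, x^2, x^3)
dist2 <- dist(X3)
```

Both objects have class "dist", so under the usage shown earlier they could presumably be passed as dist1 and dist2 to ldblm.dist.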
The set of bandwidth h values checked in automatic
bandwidth choice is defined by h.range and noh,
together with k.knn. For each h in it a local linear
model is fitted and the optimal h is decided according to the
statistic specified in method.
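
As a rough sketch of how a set of candidate bandwidths might be laid out, the following base R code takes the h.range default shown in the ldblm.dist usage and spreads noh values across it. Equal spacing is an assumption made here for illustration; the package may space the candidates differently, and the role of k.knn is not modelled. The name h.grid is hypothetical.

```r
set.seed(1)
Z <- matrix(rnorm(50), nrow = 50)
dist1 <- dist(Z)

# Range taken from the ldblm.dist default shown in the usage.
h.range <- quantile(as.matrix(dist1), c(0.05, 0.5))^0.5
noh <- 10

# ASSUMPTION: equally spaced candidates between the two ends of h.range.
h.grid <- seq(h.range[1], h.range[2], length.out = noh)
```

Each value in h.grid would then correspond to one local fit, with the selection statistic compared across them.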
kind.of.kernel designates which kernel function is to be used
in determining individual weights from dist1 values.
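
To make the weighting step concrete, here is a minimal base R sketch that turns dist1 values into weights for the neighborhood of one observation. It assumes an Epanechnikov kernel purely for illustration; which kernel each kind.of.kernel code selects is not stated in this text, and the helper epanechnikov and the bandwidth choice are hypothetical.

```r
set.seed(1)
Z <- matrix(rnorm(30), nrow = 30)
D1 <- as.matrix(dist(Z))   # dist1 values as a full matrix
h <- quantile(D1, 0.25)    # illustrative bandwidth

# Epanechnikov kernel: positive inside the unit interval, zero outside.
epanechnikov <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Weights for the neighborhood centered at observation i:
# a decreasing function of the dist1 distance to i, scaled by h.
i <- 1
w <- epanechnikov(D1[i, ] / h)
w <- w / sum(w)            # normalize so the weights sum to one
```

Observations farther than h from the center get zero weight, so a smaller h gives a more local fit.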
See density for more information.

dblm for distance-based linear models.
ldbglm for local distance-based generalized linear models.
summary.ldblm for summary.
plot.ldblm for plots.
predict.ldblm for predictions.

# example of use of the ldblm function
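n <- 100                              # number of observations
p <- 1                                # number of explanatory variables
k <- 5                                # upper bound for the simulated coefficients
Z <- matrix(rnorm(n*p),nrow=n)        # observed explanatory variable
b1 <- matrix(runif(p)*k,nrow=p)       # linear coefficient
b2 <- matrix(runif(p)*k,nrow=p)       # quadratic coefficient
b3 <- matrix(runif(p)*k,nrow=p)       # cubic coefficient
s <- 1                                # error standard deviation
e <- rnorm(n)*s
y <- Z%*%b1 + Z^2%*%b2 +Z^3%*%b3 + e  # cubic polynomial response plus noise
D2<-as.matrix(dist(Z)^2)              # squared distances matrix
class(D2)<-"D2"
# fit from the formula interface, bandwidth chosen by GCV among noh=3 values
ldblm1<-ldblm(y~Z,kind.of.kernel=1,method="GCV",noh=3,k.knn=3)
# fit from a D2 object with method="user_h"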
ldblm2<-ldblm.D2(y,D2_1=D2,D2_2=D2,kind.of.kernel=1,method="user_h",k.knn=3)