dbglm is a variety of generalized linear model where explanatory information is coded as distances between individuals. These distances can either be computed from observed explanatory variables or directly input as a squared inter-distances matrix.
The response and link function are specified as in the glm function for ordinary generalized linear models.
Notation convention: in distance-based methods we must distinguish observed explanatory variables, which we denote by Z or z, from Euclidean coordinates, which we denote by X or x. For an explanation of the meaning of both terms, see the bibliography references below.

## S3 method for class 'formula':
dbglm(formula,data,family=gaussian,...,
metric="euclidean",weights,maxiter=100,eps1=1e-10,
eps2=1e-10,rel.gvar=0.95,eff.rank=NULL,offset,mustart=NULL)
# method for distance class 'dist' or 'dissimilarity'
dbglm.dist(y,distance,family=gaussian,weights,
maxiter=100,eps1=1e-10,eps2=1e-10,rel.gvar=0.95,eff.rank=NULL,
offset,mustart=NULL,...)
# method for distance class 'D2'
dbglm.D2(y,D2,...,family=gaussian,weights,maxiter=100,
eps1=1e-10,eps2=1e-10,rel.gvar=0.95,eff.rank=NULL,offset,
mustart=NULL)
# method for class 'Gram'
dbglm.Gram(y,G,...,family=gaussian,weights,maxiter=100,
eps1=1e-10,eps2=1e-10,rel.gvar=0.95,eff.rank=NULL,
offset,mustart=NULL)
D2 class object. Squared distances matrix between individuals. See the Details section in dblm to learn the usage.

Gram class object. Doubly centered inner product matrix of the squared distances matrix D2. See details in dblm.
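The relation between the two inputs described above can be sketched in base R. This is only an illustration under stated assumptions: it uses unweighted double centering, G = -1/2 J D2 J with J = I - (1/n) 1 1', and the variable names (`J`, `zc`, `max_abs_diff`) are hypothetical. The package may centre with respect to the weights, so this is not necessarily how it builds a Gram object internally.

```r
set.seed(1)
z <- rnorm(6)                      # observed explanatory variable
n <- length(z)

# Squared Euclidean inter-distances, tagged with class "D2" as in the examples
D2 <- as.matrix(dist(z))^2
class(D2) <- "D2"

# Unweighted double centering (an assumption): G = -1/2 * J %*% D2 %*% J
J <- diag(n) - matrix(1 / n, n, n)
G <- -0.5 * J %*% unclass(D2) %*% J

# For Euclidean distances, G equals the inner products of the centered data
zc <- z - mean(z)
max_abs_diff <- max(abs(G - tcrossprod(zc)))
```

Under these assumptions `max_abs_diff` is zero up to rounding error, and the rows of `G` sum to zero, which is what "doubly centered" refers to.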
"euclidean" (the default), "manhattan", or "gower".

dblm algorithm. (Default = 100)

"DevStat": convergence tolerance eps1, a positive (small) number; the iterations converge when |dev - dev_{old}|/(|dev|) < eps1. Stationarity of deviance has been attained.

"muStat": convergence tolerance eps2, a positive (small) number; the iterations converge when |mu - mu_{old}|/(|mu|) < eps2. Stationarity of fitted.values mu has been attained.

dblm iteration, take the lowest effective rank, with a relative geometric variability higher or equal to rel.gvar. Default value (rel.gv

dblm iteration. If specified its value overrides rel.gvar. When eff.rank=NULL
(default

dbglm containing the following components:

working residuals, that is the dblm residuals in the last iteration of dblm fit.

dblm iteration.

family object used.

dblm) iterations.

working weights, that are the weights in the last iteration of dblm fit.

"DevStat" (stopping criterion 1), "muStat" (stopping criterion 2), "maxiter" (maximum allowed number of iterations has been exceeded).

dblm iteration.

dblm iteration.

working effective rank, that is the eff.rank in the last dblm iteration.

"dbglm" are actually of class c("dbglm", "dblm"), inheriting the plot.dblm method from class "dblm".

dblm.
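The stopping criteria "DevStat", "muStat" and "maxiter" described above can be illustrated with a hand-written iteratively reweighted least squares loop in base R. This is a minimal sketch, not the package's code: it fits an ordinary Poisson GLM on a design matrix instead of iterating dblm, it takes |mu| to be the Euclidean norm, and the starting value and the names `dev_stat`, `mu_stat` and `convcrit` are assumptions made for the illustration.

```r
set.seed(1)
z <- rnorm(100)
y <- rpois(100, exp(1 + z))
X <- cbind(1, z)

fam <- poisson(link = "log")
eps1 <- 1e-10; eps2 <- 1e-10; maxiter <- 100
wt <- rep(1, length(y))

mu  <- y + 0.1                      # simple starting value (an assumption)
eta <- fam$linkfun(mu)
dev <- sum(fam$dev.resids(y, mu, wt))
convcrit <- "maxiter"               # reported if neither tolerance is met

for (iter in seq_len(maxiter)) {
  gprime <- 1 / fam$mu.eta(eta)                 # d eta / d mu
  zwork  <- eta + (y - mu) * gprime             # working response
  wwork  <- 1 / (gprime^2 * fam$variance(mu))   # working weights
  fit    <- lm.wfit(X, zwork, wwork)            # weighted least squares step

  eta_new <- drop(X %*% fit$coefficients)
  mu_new  <- fam$linkinv(eta_new)
  dev_new <- sum(fam$dev.resids(y, mu_new, wt))

  # Relative changes in deviance and in fitted values
  dev_stat <- abs(dev_new - dev) / abs(dev_new)
  mu_stat  <- sqrt(sum((mu_new - mu)^2)) / sqrt(sum(mu_new^2))

  eta <- eta_new; mu <- mu_new; dev <- dev_new
  if (dev_stat < eps1) { convcrit <- "DevStat"; break }
  if (mu_stat  < eps2) { convcrit <- "muStat";  break }
}
```

After the loop, `convcrit` records which criterion ended the iterations, in the same spirit as the convergence criterion component of the returned object.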
For gamma distributions, the domain of the canonical link function is not the same as the permitted range of the mean. In particular, the linear predictor might be negative, yielding an impossible negative mean. Should that event occur, dbglm stops with an error message. The proposed alternative is to use a non-canonical link function.

summary.dbglm for summary.
plot.dbglm for plots.
predict.dbglm for predictions.
dblm for distance-based linear models.

## CASE POISSON
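library(dbstats)
z <- rnorm(100)
y <- rpois(100, exp(1 + z))
# ordinary GLM fit, for comparison
glm1 <- glm(y ~ z, family = poisson(link = "log"))
# squared Euclidean distances between individuals, tagged as class "D2"
D2 <- as.matrix(dist(z))^2
class(D2) <- "D2"
dbglm1 <- dbglm.D2(y, D2, family = poisson(link = "log"))
# observed values with glm (col 2) and dbglm (col 3) fitted values
plot(z, y)
points(z, glm1$fitted.values, col = 2)
points(z, dbglm1$fitted.values, col = 3)
# residual sums of squares of both fits
sum((glm1$fitted.values - y)^2)
sum((dbglm1$fitted.values - y)^2)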
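## CASE BINOMIAL
# binary response generated from the same z as in the Poisson case
y <- rbinom(100, 1, plogis(z))
# ordinary GLM fit, for comparison
glm2 <- glm(y ~ z, family = binomial(link = "logit"))
D2 <- as.matrix(dist(z))^2
class(D2) <- "D2"
dbglm2 <- dbglm.D2(y, D2, family = binomial(link = "logit"))
# observed values with glm (col 2) and dbglm (col 3) fitted probabilities
plot(z, y)
points(z, glm2$fitted.values, col = 2)
points(z, dbglm2$fitted.values, col = 3)
# residual sums of squares of both fits
sum((glm2$fitted.values - y)^2)
sum((dbglm2$fitted.values - y)^2)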
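The note on gamma distributions can be illustrated with base R family objects alone. The sketch below does not call dbglm. It only shows why a negative linear predictor is a problem under the canonical inverse link and why a non-canonical link such as the log avoids it. The values in `eta` are made up for the illustration.

```r
fam_inv <- Gamma()                 # canonical "inverse" link: mu = 1 / eta
fam_log <- Gamma(link = "log")     # non-canonical alternative: mu = exp(eta)

eta <- c(-0.5, 0.5, 2)             # linear predictor with one negative entry

mu_inv <- fam_inv$linkinv(eta)     # a negative eta gives a negative mean
mu_log <- fam_log$linkinv(eta)     # exp() keeps every mean positive

# The family's own check of whether the means are admissible
ok_inv <- fam_inv$validmu(mu_inv)
ok_log <- fam_log$validmu(mu_log)
```

A family built this way can be passed through the `family` argument in the same way as the `poisson(link = "log")` and `binomial(link = "logit")` objects in the cases above.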