bfsl calculates the best-fit straight line to independent points with (possibly correlated) normally distributed errors in both coordinates.
bfsl(...)
# S3 method for default
bfsl(x, y = NULL, sd_x = 0, sd_y = 1, r = 0, control = bfsl_control(), ...)
# S3 method for formula
bfsl(
formula,
data = parent.frame(),
sd_x,
sd_y,
r = 0,
control = bfsl_control(),
...
)
...: Further arguments passed to or from other methods.
x: A vector of x observations or a data frame (or an object coercible by as.data.frame to a data frame) containing the named vectors x, y, and optionally sd_x, sd_y and r. If weights w_x and w_y are given, then sd_x and sd_y are calculated from sd_x = 1/sqrt(w_x) and sd_y = 1/sqrt(w_y). Specifying y, sd_x, sd_y or r directly as function arguments overrides these variables in the data structure.
y: A vector of y observations.
sd_x: A vector of x measurement error standard deviations. If it is of length one, all data points are assumed to have the same x standard deviation.
sd_y: A vector of y measurement error standard deviations. If it is of length one, all data points are assumed to have the same y standard deviation.
r: A vector of correlation coefficients between errors in x and y. If it is of length one, all data points are assumed to have the same correlation coefficient.
control: A list of control settings. See bfsl_control for the names of the settable control values and their effect.
formula: A formula specifying the bivariate model (as in lm, but here only y ~ x makes sense).
data: A data.frame containing the variables of the model.
An object of class "bfsl", which is a list containing the following components:
A 2x2 matrix with columns of the fitted coefficients (intercept and slope) and their standard errors.
The goodness of fit (see Details).
The fitted mean values.
The residuals, that is y observations minus fitted values.
The residual degrees of freedom.
The covariance of the slope and intercept.
The control list used, see the control argument.
A list with convergence information.
The matched call.
A list containing x, y, sd_x, sd_y and r.
bfsl provides the general least-squares estimation solution to the problem of fitting a straight line to independent data with (possibly correlated) normally distributed errors in both x and y.
With sd_x = 0 the (weighted) ordinary least squares solution is obtained. The calculated standard errors of the slope and intercept, multiplied by sqrt(chisq), correspond to the ordinary least squares standard errors.
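The relationship between the two kinds of standard errors can be checked in base R without calling bfsl. The following is a minimal sketch on made-up data (the x, y and sd_y values are hypothetical, chosen only for illustration). It computes the weighted least-squares fit while treating sd_y as exact, forms the reduced chi-squared, and compares the rescaled standard errors with those reported by lm, which estimates the error scale from the residuals instead.

```r
# Hypothetical data, for illustration only
x <- c(0, 1, 2, 3, 4, 5)
y <- c(0.1, 1.9, 4.2, 5.8, 8.1, 9.9)
sd_y <- c(0.1, 0.2, 0.1, 0.3, 0.2, 0.1)

w <- 1 / sd_y^2                        # weights from the assumed y standard deviations
X <- cbind(1, x)                       # design matrix: intercept and slope columns
XtWX_inv <- solve(t(X) %*% (w * X))    # (X' W X)^-1
beta <- XtWX_inv %*% t(X) %*% (w * y)  # weighted least-squares coefficients
se_known <- sqrt(diag(XtWX_inv))       # standard errors if sd_y is taken as exact

res <- y - X %*% beta
chisq <- sum(w * res^2) / (length(x) - 2)  # reduced chi-squared

# lm() scales its standard errors by the residual-based error estimate
fit_lm <- lm(y ~ x, weights = w)
se_lm <- unname(summary(fit_lm)$coefficients[, "Std. Error"])

all.equal(se_lm, as.vector(se_known * sqrt(chisq)))
```

The final comparison is expected to hold because the residual variance estimated by lm under these weights is the same quantity as the reduced chi-squared computed above.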
With sd_x = c, sd_y = d, where c and d are positive numbers, and r = 0, the Deming regression solution is obtained. If additionally c = d, the orthogonal distance regression solution, also known as major axis regression, is obtained.
Setting sd_x = sd(x), sd_y = sd(y) and r = 0 leads to the geometric mean regression solution, also known as reduced major axis regression or standardised major axis regression.
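These special cases can be illustrated with the textbook closed-form Deming slope. The sketch below uses base R only and does not call bfsl. The helper deming_slope and the data values are hypothetical, introduced here for illustration. It checks that equal error standard deviations reproduce the first principal axis, and that sd_x = sd(x), sd_y = sd(y) reproduces the ratio of standard deviations used by geometric mean regression.

```r
# Closed-form Deming slope for constant error standard deviations and r = 0.
# deming_slope is an illustrative helper, not part of the bfsl package.
deming_slope <- function(x, y, sd_x, sd_y) {
  delta <- (sd_y / sd_x)^2          # ratio of y to x error variances
  sxx <- var(x)
  syy <- var(y)
  sxy <- cov(x, y)
  d <- syy - delta * sxx
  (d + sqrt(d^2 + 4 * delta * sxy^2)) / (2 * sxy)
}

# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1.2, 1.9, 3.4, 3.8, 5.3, 5.9)

# Equal error standard deviations: orthogonal (major axis) regression
b_oda <- deming_slope(x, y, sd_x = 1, sd_y = 1)
pc1 <- prcomp(cbind(x, y))$rotation[, 1]   # first principal axis
b_pca <- unname(pc1[2] / pc1[1])

# sd_x = sd(x), sd_y = sd(y): geometric mean (reduced major axis) regression
b_gm <- deming_slope(x, y, sd_x = sd(x), sd_y = sd(y))
b_ratio <- sign(cov(x, y)) * sd(y) / sd(x)

all.equal(b_oda, b_pca)
all.equal(b_gm, b_ratio)
```

Only the slope is compared here; in each case the fitted line passes through the point of means, so the intercept follows as mean(y) minus the slope times mean(x).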
The goodness of fit metric chisq is a weighted reduced chi-squared statistic. It compares the deviations of the points from the fit line to the assigned measurement error standard deviations. If x and y are indeed related by a straight line, and if the assigned measurement errors are correct (and normally distributed), then chisq is expected to be close to 1. A chisq > 1 indicates underfitting: the fit does not fully capture the data or the measurement errors have been underestimated. A chisq < 1 indicates overfitting: either the model is improperly fitting noise, or the measurement errors have been overestimated.
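As a rough illustration of how such a statistic responds to the assigned errors, the sketch below evaluates a reduced chi-squared for a given trial line, assuming the York-type weighting 1 / (sd_y^2 + b^2 sd_x^2 - 2 b r sd_x sd_y) for each point. Whether bfsl computes chisq in exactly this form is an assumption here. The helper reduced_chisq and the data are hypothetical, and the line a = 0, b = 1 is a trial line, not a fitted one.

```r
# Reduced chi-squared of the line y = a + b * x with errors in both coordinates.
# reduced_chisq is an illustrative helper, not part of the bfsl package.
reduced_chisq <- function(x, y, sd_x, sd_y, r = 0, a, b) {
  # Effective weight of each point (assumed York-type form)
  w <- 1 / (sd_y^2 + b^2 * sd_x^2 - 2 * b * r * sd_x * sd_y)
  # Weighted squared deviations divided by the residual degrees of freedom
  sum(w * (y - a - b * x)^2) / (length(x) - 2)
}

# Hypothetical data: three points on the trial line, one point off by 1
x <- 1:4
y <- c(1, 2, 3, 5)

chisq_y_only <- reduced_chisq(x, y, sd_x = 0, sd_y = 1, a = 0, b = 1)  # 0.5
chisq_both   <- reduced_chisq(x, y, sd_x = 1, sd_y = 1, a = 0, b = 1)  # 0.25
```

Assigning a nonzero sd_x in addition to sd_y enlarges the assumed error of each point, which lowers the statistic for the same deviations. This is the mechanism behind overestimated errors producing chisq < 1.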
York, D. (1968). Least squares fitting of a straight line with correlated errors. Earth and Planetary Science Letters, 5, 320–324, https://doi.org/10.1016/S0012-821X(68)80059-7
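library(bfsl)  # provides bfsl() and the pearson_york_data example data

# Convert the weights in the example data to standard deviations
x = pearson_york_data$x
y = pearson_york_data$y
sd_x = 1/sqrt(pearson_york_data$w_x)
sd_y = 1/sqrt(pearson_york_data$w_y)

# Default method with explicit vectors
bfsl(x, y, sd_x, sd_y)
# Formula method
bfsl(y~x, pearson_york_data, sd_x, sd_y)
# Data frame input: sd_x and sd_y are calculated from w_x and w_y
fit = bfsl(pearson_york_data)
plot(fit)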