Estimate transformations of x and y such that the regression of y on x is approximately linear with constant variance.
avas(...)

# S3 method for default
avas(
  x,
  y,
  wt = NULL,
  cat = NULL,
  mon = NULL,
  lin = NULL,
  circ = NULL,
  delrsq = 0.01,
  yspan = 0,
  control = NULL,
  ...
)

# S3 method for formula
avas(
  formula,
  data = NULL,
  subset = NULL,
  na.action = getOption("na.action"),
  ...
)

# S3 method for avas
summary(object, ...)

# S3 method for avas
print(x, ..., digits = 4)

# S3 method for avas
plot(
  x,
  ...,
  which = 1:(x$p + 1),
  caption = c(list("Response Y AVAS Transformation"),
              as.list(paste("Carrier", rownames(x$x), "AVAS Transformation"))),
  xlab = "Original",
  ylab = "Transformed",
  ask = prod(par("mfcol")) < length(which) && dev.interactive()
)
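The default and formula methods shown above can be compared in a short sketch. It assumes that avas is provided by the acepack package and that the package is installed; the data frame d and its columns are invented purely for illustration.

```r
library(acepack)  # assumption: avas() comes from the acepack package

set.seed(1)
# Hypothetical data: a positive response with a nonlinear dependence on x1
d <- data.frame(x1 = runif(150), x2 = runif(150))
d$y <- exp(1 + 2 * d$x1^2 + d$x2 + rnorm(150) / 4)

# Default method: predictors as a matrix, response as a vector
fit_default <- avas(as.matrix(d[, c("x1", "x2")]), d$y)

# Formula method: the same model, with variables looked up in `data`
fit_formula <- avas(y ~ x1 + x2, data = d)

# Both fits carry one transformed response value per observation
length(fit_default$ty)
length(fit_formula$ty)
```

The formula method is convenient when the variables already live in a data frame; the default method avoids building a model frame when a numeric matrix is already at hand.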
A structure with the following components:
x: the input x matrix.
y: the input y vector.
tx: the transformed x values.
ty: the transformed y values.
rsq: the multiple R-squared value for the transformed values.
l: the codes for cat, mon, ...
m: not used in this version of avas.
yspan: span used for smoothing the variance.
iters: iteration number and rsq for that iteration.
niters: number of iterations used.
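One way to explore the returned structure is to list its component names and recompute the fit quality from the transformed values. This is a minimal sketch, assuming the acepack package is installed; it relies only on the tx and ty components, and the R-squared here is recomputed with lm rather than read from the object.

```r
library(acepack)  # assumption: avas() comes from the acepack package

set.seed(42)
x <- runif(200, 0, 2 * pi)
y <- exp(sin(x) + rnorm(200) / 2)
a <- avas(x, y)

names(a)   # component names of the returned structure
str(a$tx)  # transformed x values
str(a$ty)  # transformed y values

# R-squared of the transformed response regressed on the transformed predictor
r2 <- summary(lm(a$ty ~ a$tx))$r.squared
r2
```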
...: additional arguments. They are ignored by the avas call and are included for S3 dispatch consistency. With print they are passed on to cat; when plotting an avas object they are passed on to plot.
x: a matrix containing the independent variables.
y: a vector containing the response variable.
wt: an optional vector of weights.
cat: an optional integer vector specifying which variables assume categorical values. Positive values in cat refer to columns of the x matrix and zero to the response variable. Variables must be numeric, so a character variable should first be transformed with as.numeric() and then specified as categorical.
mon: an optional integer vector specifying which variables are to be transformed by monotone transformations. Positive values in mon refer to columns of the x matrix and zero to the response variable.
lin: an optional integer vector specifying which variables are to be transformed by linear transformations. Positive values in lin refer to columns of the x matrix and zero to the response variable.
circ: an integer vector specifying which variables assume circular (periodic) values. Positive values in circ refer to columns of the x matrix and zero to the response variable.
delrsq: numeric(1); termination threshold for iteration. Stops when R-squared changes by less than delrsq in 3 consecutive iterations (default 0.01).
yspan: optional window size parameter for smoothing the variance. Range is
control: named list; control parameters to set. Documented at set_control.
formula: an object of class "formula": a symbolic description of the model to be smoothed.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which avas is called.
subset: an optional vector specifying a subset of observations to be used in the fitting process. Only used when a formula is specified.
na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
object: an S3 avas object.
digits: rounding digits for summary/print.
which: when plotting an avas object, which plots to produce.
caption: a list of captions for a plot.
xlab: the x-axis label when plotting.
ylab: the y-axis label when plotting.
ask: when plotting, whether the terminal should be asked for input between plots.
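The cat, lin, and which arguments described above can be combined as in the following sketch. It assumes the acepack package is installed; the variable names grp_chr, grp, and z are hypothetical. Note that calling as.numeric() directly on non-numeric strings yields NA, so the sketch goes through factor() to obtain integer codes before marking the column as categorical.

```r
library(acepack)  # assumption: avas() comes from the acepack package

set.seed(7)
n <- 200
grp_chr <- sample(c("low", "mid", "high"), n, replace = TRUE)  # character variable
z <- runif(n, -1, 1)

# Convert the character variable to integer codes 1, 2, 3 via factor()
grp <- as.numeric(factor(grp_chr, levels = c("low", "mid", "high")))

y <- exp(0.5 * grp + z + rnorm(n) / 4)
x <- cbind(grp = grp, z = z)

# Column 1 is treated as categorical; column 2 is restricted to a linear transformation
fit <- avas(x, y, cat = 1, lin = 2)

# Plot only the response transformation and the first carrier transformation
plot(fit, which = 1:2)

# With lin = 2, the transform of z should be a straight line in z
cor(z, fit$tx[, 2])
```

Restricting a variable with lin is useful when a linear effect is known in advance, since it keeps the smoother from fitting noise in that column.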
Rob Tibshirani (1988), "Estimating transformations for regression via additivity and variance stabilization". Journal of the American Statistical Association 83, 394-405.
library(acepack) # assumption: avas() is provided by the acepack package
TWOPI <- 8 * atan(1)
x <- runif(200, 0, TWOPI)
y <- exp(sin(x) + rnorm(200) / 2)
a <- avas(x, y)
plot(a) # View response and carrier transformations
plot(a$tx, a$ty) # Examine the linearity of the fitted model
# From D. Wang and M. Murphy (2005), Identifying nonlinear relationships
# in regression using the ACE algorithm. Journal of Applied Statistics,
# 32, 243-258, adapted for avas.
X1 <- runif(100)*2-1
X2 <- runif(100)*2-1
X3 <- runif(100)*2-1
X4 <- runif(100)*2-1
# Original equation of Y:
Y <- log(4 + sin(3*X1) + abs(X2) + X3^2 + X4 + .1*rnorm(100))
# Transformed version so that Y, after transformation, is a
# linear function of transforms of the X variables:
# exp(Y) = 4 + sin(3*X1) + abs(X2) + X3^2 + X4
a1 <- avas(cbind(X1,X2,X3,X4),Y)
par(mfrow=c(2,1))
# For each variable, show its transform as a function of
# the original variable and of the transform that created it,
# showing that the transform is recovered.
plot(X1,a1$tx[,1])
plot(sin(3*X1),a1$tx[,1])
plot(X2,a1$tx[,2])
plot(abs(X2),a1$tx[,2])
plot(X3,a1$tx[,3])
plot(X3^2,a1$tx[,3])
plot(X4,a1$tx[,4])
plot(X4,a1$tx[,4])
plot(Y,a1$ty)
plot(exp(Y),a1$ty)
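The plots above judge linearity by eye. A rough numerical check of the two goals named in the description (a linear relationship and constant variance after transformation) might look like the sketch below. It assumes the acepack package is installed; splitting the observations into three equal-sized bands is an arbitrary choice made for illustration, not part of avas.

```r
library(acepack)  # assumption: avas() comes from the acepack package

set.seed(123)
x <- runif(200, 0, 2 * pi)
y <- exp(sin(x) + rnorm(200) / 2)
a <- avas(x, y)

tx <- as.vector(a$tx)

# Linearity: straight-line fit of the transformed response on the transformed predictor
fit <- lm(a$ty ~ tx)
coef(fit)

# Constant variance: residual spread in the lower, middle, and upper third of tx
band <- cut(rank(tx, ties.method = "first"), breaks = 3,
            labels = c("low", "mid", "high"))
spread <- tapply(residuals(fit), band, sd)
spread
```

Similar standard deviations across the three bands suggest that the variance has been stabilized; a strong trend would suggest otherwise.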