Regression functions such as lm typically specify the data to be used
based on a formula (and optional further arguments), as in
lm(formula, ...). The regression function then typically calls the
generic function model.frame to convert the formula (and related
arguments) to a data frame which forms the basic data used by the
regression.
In such a situation, the formula argument of the regression function is
passed to model.frame, so the class of the formula determines which
method of model.frame is used for data handling by the regression
function. Normally the class of formula is "formula", which causes
model.frame.formula to be called. In time series regression (and
potentially in other situations as well), a more specialized
model.frame method should be called depending on the class of the
dependent variable. For this alternate form of dispatch,
model.frame.AsIs is introduced: by insulating the formula argument in
I(formula), the class is changed to "AsIs" (leaving formula unchanged)
so that model.frame.AsIs is dispatched. model.frame.AsIs does no
processing of its own other than to examine the dependent variable of
the formula and redispatch according to its class.
Thus, if the dependent variable specified in I(formula) is of class
"foo", the method model.frame.foo will be called for handling the data.
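
The dispatch mechanism described above can be sketched in base R alone.
This is a minimal illustration and not the actual zoo implementation:
the generic mf_sketch, its methods, and the helper response_class are
hypothetical names introduced here, and "foo" is the placeholder class
used in the text.

```r
# Base R only. All names below (mf_sketch, response_class) are hypothetical
# and only illustrate the dispatch idea; they are not part of zoo.

f  <- y ~ x
fi <- I(f)          # I() prepends "AsIs" to the class vector
class(f)            # "formula"
class(fi)           # "AsIs" "formula"

# Class of the dependent variable (left-hand side) of a two-sided formula,
# evaluated in the environment attached to the formula.
response_class <- function(formula) {
  lhs <- formula[[2L]]
  class(eval(lhs, environment(formula)))[1L]
}

# Toy generic standing in for model.frame.
mf_sketch <- function(formula, ...) UseMethod("mf_sketch")

# The "AsIs" method only looks up the response class and redispatches.
mf_sketch.AsIs <- function(formula, ...) {
  method <- get(paste0("mf_sketch.", response_class(formula)),
                mode = "function")
  method(formula, ...)
}

# Method chosen when the dependent variable has class "foo".
mf_sketch.foo <- function(formula, ...) "handled by the foo method"

y <- structure(1:5, class = "foo")
x <- 1:5
result <- mf_sketch(fi)
```

Without I(), the same formula would have class "formula" only, so a toy
method for "AsIs" would never be reached; wrapping the formula is what
opts in to dispatch on the dependent variable.
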
If the dependent variable in such a model is of class "zoo", then
model.frame.zoo will be called. Its key role is to inspect a formula
that may contain only "zoo" objects as variables and transform it to a
model frame that can be used in various regression functions,
appropriately aligning the various series. If the "zoo" series are
specified using the data argument, that argument can be a list of "zoo"
objects, a single "zoo" object, or a data frame of "zoo" objects.
Similarly, a model.frame.ts method is provided for "ts" objects. Note
that, despite their names, these methods do not expect a normal "zoo"
or "ts" object as their respective first argument; rather, they expect
a formula (whose dependent variable is of class "zoo" or "ts",
respectively).
Their behaviour is essentially the same as in the default model.frame
method, but they retain the index/time information. Furthermore, they
enable the user to use diff and lag in the model specification.
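
The alignment of series and the use of lag can be approximated by hand
with base stats functions. The following is only a sketch of the idea,
assuming two annual "ts" series with different time spans; it does not
show how model.frame.ts is implemented, and the data and column names
are invented for illustration.

```r
# Base R (stats) only; data and names are invented for illustration.
y <- ts(c(2.1, 3.9, 6.2, 7.8, 10.1), start = 2001)   # covers 2001-2005
x <- ts(c(1, 2, 3, 4, 5, 6), start = 2000)           # covers 2000-2005

# stats::lag(x, -1) shifts the series so that the value observed at
# time t - 1 is attached to time t.
aligned <- ts.intersect(y = y, x = x, x_lag1 = stats::lag(x, -1))

# Only the common time window remains, and the time index is kept.
time(aligned)

# The aligned series can then be handed to a regression function.
fit <- lm(y ~ x_lag1, data = as.data.frame(aligned))
coef(fit)
```

Doing the alignment manually like this has to be repeated for every
model; handling it inside a model.frame method is what lets the same
formula work across regression functions.
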
As many regression functions in R use the same steps to extract the
data from a specified formula, this approach modularizes the data
management and regression based on "zoo" objects, making it available
in various regression functions. Hence, the user will usually not have
to call any of the model.frame functions explicitly but only has to
insulate the formula with I(). See the examples for an illustration.
The regression functions for which this approach is known to work
include lm, glm, lrm, lqs, nnet, svm, rq, randomForest and possibly
many others.
IMPORTANT: Note that this feature is under development and might
change in future versions.