Function to Reweight Data
reweightData(
  data,
  argvals,
  vars,
  longvars = NULL,
  weights,
  index,
  idvars = NULL,
  compress = FALSE
)
data: a named list or data.frame.
argvals: character (vector); name(s) of the entries in data that give the index for the observed grid points. Must be supplied if vars is not supplied.
vars: character (vector); name(s) of the entries in data that are subsetted according to weights or index. Must be supplied if argvals is not supplied.
longvars: variables in long format, e.g., a response that is observed on curve-specific grids.
weights: vector of weights for observations. Must be supplied if index is not supplied.
index: vector of indices for observations. Must be supplied if weights is not supplied.
idvars: character (vector); index that is needed to expand vars so that it conforms to the hmatrix structure when using bhistx base-learners, or to the variables in long format specified in longvars.
compress: logical; whether hmatrix objects are saved in compressed form or not. Default is FALSE. Should be set to FALSE when using reweightData for nested resampling.
A list with the reweighted or subsetted data.
reweightData indexes the rows of matrices and/or the positions of vectors by using either the index or the weights argument. To prevent the function from indexing the list entry or entries that serve as the time index for the observed grid points of each trajectory of functional observations, the argvals argument (a character vector naming these list entries) can be supplied. If argvals is not supplied, vars must be supplied, and argvals is assumed to equal names(data)[!names(data) %in% vars].
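The default rule for argvals and the row-wise indexing can be illustrated with a small base R sketch. This is not the FDboost implementation; the toy list and its entry names (y, x, time) are hypothetical, and the loop only mimics the behaviour described above for matrices and vectors.

```r
# Toy data in the layout described above (all names are hypothetical):
# 4 curves observed on a common grid of 3 time points.
toy <- list(
  y    = matrix(1:12, nrow = 4, ncol = 3),  # one row per curve
  x    = c(10, 20, 30, 40),                 # one scalar covariate per curve
  time = c(0, 0.5, 1)                       # grid points, not per-curve data
)

vars <- c("y", "x")

# Default rule quoted above: everything not listed in vars is treated as argvals
argvals <- names(toy)[!names(toy) %in% vars]

# Index rows of matrices and positions of vectors for the entries in vars only;
# the argvals entries are left untouched
index <- c(2, 2, 4)
toy_sub <- toy
for (v in vars) {
  entry <- toy[[v]]
  toy_sub[[v]] <- if (is.matrix(entry)) entry[index, , drop = FALSE] else entry[index]
}

str(toy_sub)
```

In this sketch, time keeps its three grid points while y and x are reduced to the selected curves, which is why the grid entries have to be identified through argvals or, indirectly, through vars.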
When using weights, a weight vector of length N must be supplied, where N is the number of observations. When using index, the vector must contain the index of each row as many times as it is to be included in the new data set.
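The relation between the two arguments can be sketched in base R. The assumption here, which is not stated explicitly above, is that weights are non-negative integers read as replication counts, so that a weight of w for observation i corresponds to i appearing w times in index. The weight values are made up for illustration.

```r
# Hypothetical integer weights for N = 5 observations
weights <- c(0, 2, 1, 0, 3)

# Assumption: a weight w_i means "include observation i w_i times",
# which gives the corresponding index vector
index <- rep(seq_along(weights), times = weights)
print(index)  # 2 2 3 5 5 5

# Going back from an index vector to per-observation counts
weights_back <- tabulate(index, nbins = length(weights))
print(weights_back)  # 0 2 1 0 3
```

Under this reading, an index vector can repeat and omit observations freely, while the weight vector must always have exactly one entry per observation.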
## load data
data("viscosity", package = "FDboost")
interval <- "101"
end <- which(viscosity$timeAll == as.numeric(interval))
viscosity$vis <- log(viscosity$visAll[ , 1:end])
viscosity$time <- viscosity$timeAll[1:end]

## what does data look like
str(viscosity)

## do some reweighting
# correct weights
str(reweightData(viscosity, vars = c("vis", "T_C", "T_A", "rspeed", "mflow"),
                 argvals = "time", weights = c(0, 32, 32, rep(0, 61))))

str(visNew <- reweightData(viscosity, vars = c("vis", "T_C", "T_A", "rspeed", "mflow"),
                           argvals = "time", weights = c(0, 32, 32, rep(0, 61))))
# check the result
# visNew$vis[1:5, 1:5]
## image(visNew$vis)

# incorrect weights
str(reweightData(viscosity, vars = c("vis", "T_C", "T_A", "rspeed", "mflow"),
                 argvals = "time", weights = sample(1:64, replace = TRUE)), 1)

# supply meaningful index
str(visNew <- reweightData(viscosity, vars = c("vis", "T_C", "T_A", "rspeed", "mflow"),
                           argvals = "time", index = rep(1:32, each = 2)))
# check the result
# visNew$vis[1:5, 1:5]

# errors
if (FALSE) {
  reweightData(viscosity, argvals = "")
  reweightData(viscosity, argvals = "covThatDoesntExist", index = rep(1, 64))
}