brokenstick (version 1.1.0)

predict.brokenstick: Predict from a brokenstick model

Description

The predictions from a broken stick model coincide with the group-conditional means of the random effects. This function takes an object of class brokenstick and returns predictions in one of several formats. The user can calculate predictions for new persons, i.e., for persons who are not part of the fitted model, through the x and y arguments.

Usage

# S3 method for brokenstick
predict(
  object,
  new_data = NULL,
  type = "numeric",
  ...,
  x = NULL,
  y = NULL,
  group = NULL,
  strip_data = TRUE,
  shape = c("long", "wide", "vector")
)

Arguments

object

A brokenstick object.

new_data

A data frame or matrix of new predictors.

type

A single character string giving the type of predictions to generate. Valid options are:

  • "numeric" for numeric predictions.

...

Not used, but required for extensibility.

x

Optional. A numeric vector with values of the predictor. The special keyword x = "knots" replaces x by the positions of the knots.

y

Optional. A numeric vector with measurements.

group

Optional. A vector with group identifications.

strip_data

A logical indicating whether the rows with the observed data from new_data should be stripped from the return value. The default is TRUE. Set to FALSE to infer which data points are extracted from new_data.

shape

A string: "long" (default), "wide" or "vector" specifying the shape of the return value. Note that use of "wide" with many unique values in x creates an unwieldy, large and sparse matrix.

Value

A tibble of predictions. If x, y and group are not specified, the number of rows in the tibble is guaranteed to be the same as the number of rows in new_data.

Details

By default, predict() calculates predictions for every row in new_data. It is possible to tailor the behavior through the x, y and group arguments. What exactly happens depends on which of these arguments is specified:

  1. If the user specifies x, but not y or group, the function returns predictions at the x values for every group in new_data. This method uses the data from new_data.

  2. If the user specifies x and y but no group, the function forms a hypothetical new group with the x and y values. This method uses no information from new_data.

  3. If the user specifies group, but no x or y, the function searches for the relevant data in new_data and limits its predictions to the specified groups. This is useful if prediction for only one or a few groups is needed.

  4. If the user specifies x and group, but no y, the function will create new values for x in each group, search for the relevant data in new_data and limit prediction to locations x in those groups.

  5. If the user specifies x, y and group, the function assumes that these vectors form a data frame. The lengths of x, y and group must be the same. This procedure uses information from new_data only for groups whose group values match those in new_data.

  6. As case 5, but without a new_data argument. All data are specified through x, y and group. No matching to new_data is attempted.
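
The six cases above depend only on which of new_data, x, y and group are supplied. As a reading aid, the mapping can be sketched as a small base R helper. This is an illustration only and is not part of brokenstick: the function name predict_case is hypothetical, and the sketch does not reproduce the package's internal logic or call the model.

```r
# Hypothetical helper (not part of brokenstick): maps the pattern of
# supplied arguments to the case numbers listed in Details.
predict_case <- function(new_data = NULL, x = NULL, y = NULL, group = NULL) {
  has_x <- !is.null(x)
  has_y <- !is.null(y)
  has_g <- !is.null(group)

  if (has_x && has_y && has_g) {
    # Cases 5 and 6 differ only in whether new_data is supplied
    return(if (is.null(new_data)) 6L else 5L)
  }
  if (has_x && has_y) return(2L)   # x and y, no group: one hypothetical new group
  if (has_x && has_g) return(4L)   # x and group, no y
  if (has_x) return(1L)            # x only
  if (has_g && !has_y) return(3L)  # group only
  if (!has_y) return(0L)           # nothing given: default, every row in new_data
  NA_integer_                      # combination not described in Details (e.g. y without x)
}

predict_case(x = "knots")                                    # case 1
predict_case(x = "knots", y = c(1, 1, 0.5, 0))               # case 2
predict_case(new_data = data.frame(), group = c(11045, 999)) # case 3
```

The return value 0 stands for the default behavior (predictions for every row in new_data), and NA marks argument combinations that the list above does not describe.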

Examples

# NOT RUN {
library(brokenstick)  # provides brokenstick(), predict() method and smocc_200

train <- smocc_200[1:1198, ]
test <- smocc_200[1199:1940, ]

# Fit
fit <- brokenstick(hgt.z ~ age | id, data = train, knots = 0:3)

# Predict, with preprocessing
tail(predict(fit, test), 3)

# case 1: x as knots
z <- predict(fit, test, x = "knots")

# case 2: x and y, one new group
predict(fit, test, x = "knots", y = c(1, 1, 0.5, 0))

# case 2: x and y, one new group, we need not specify new_data
predict(fit, x = "knots", y = c(1, 1, 0.5, 0))

# case 3: only group
predict(fit, test, group = c(11045, 11120, 999))

# case 4: predict at x in selected groups
predict(fit, test, x = c(0.5, 1, 1.25), group = c(11045, 11120, 999))

# case 5: vectorized
predict(fit, test, x = c(0.5, 1, 1.25), y = c(0, 0.5, 1), group = c(11045, 11120, 999))

# case 6: vectorized, without new_data, results are different for 11045 and 11120
predict(fit, x = c(0.5, 1, 1.25), y = c(0, 0.5, 1), group = c(11045, 11120, 999))
# }
