trajMeasures: Compute Measures for Identifying Patterns of Change in Longitudinal Data

Description

trajMeasures computes up to 20 measures for each longitudinal trajectory. See Details for the list of measures.

Usage

trajMeasures(
  Data,
  Time = NULL,
  ID = FALSE,
  measures = c(1:10, 12:20),
  midpoint = NULL,
  cap.outliers = FALSE
)
# S3 method for trajMeasures
print(x, ...)
# S3 method for trajMeasures
summary(object, ...)

Value

An object of class trajMeasures; a list containing the values of the measures, a table of the outliers which have been capped, as well as a curated form of the function's arguments.

Arguments

Data: a matrix or data frame in which each row contains one trajectory (the longitudinal data for one unit).
Time: either NULL, a vector, or a matrix/data frame of the same dimension as Data. If a vector, matrix or data frame is supplied, its entries are taken to be the times at which the corresponding cells of Data were measured. When set to NULL (the default), the times are assumed equidistant.
ID: logical. Set to TRUE if the first column of Data and of Time corresponds to an ID variable identifying the trajectories. Defaults to FALSE.
measures: a vector containing the numerical identifiers of the measures to compute. The default, c(1:10, 12:20), excludes measure 11, which requires a midpoint.
midpoint: specifies which column of Time to use as the midpoint in measure 11. Can be NULL, an integer, or a vector of integers whose length equals the number of rows in Time. The default is NULL, in which case the midpoint is the time closest to the median of the Time vector specific to each trajectory.
cap.outliers: logical. If TRUE, extreme values of the measures will be capped. Defaults to FALSE.
x: object of class trajMeasures.
...: further arguments passed to or from other methods.
object: object of class trajMeasures.
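
The layout expected for Data and Time can be sketched as follows. This is a minimal illustration with made-up values, not output from the package: the column names (y1, t1, ...) are hypothetical, and the call to trajMeasures only runs if the traj package is installed and exports that function.

```r
# Toy data: three trajectories with five observations each (made-up values).
# The first column is an ID variable, so ID = TRUE is used below.
Data <- data.frame(
  ID = 1:3,
  y1 = c(1.0, 2.0, 0.5),
  y2 = c(1.5, 2.1, 0.7),
  y3 = c(2.5, 1.9, 1.4),
  y4 = c(2.0, 2.4, 2.9),
  y5 = c(3.0, 2.2, 4.1)
)

# Matching observation times: same dimension as Data, ID in the first column.
# The times need not be equidistant or shared across trajectories.
Time <- data.frame(
  ID = 1:3,
  t1 = c(0, 0, 0),
  t2 = c(1, 2, 1),
  t3 = c(2, 3, 4),
  t4 = c(5, 6, 7),
  t5 = c(9, 7, 8)
)

# Time must have the same dimension as Data when given as a data frame.
stopifnot(identical(dim(Data), dim(Time)))

# Only attempt the call when the package and the function are available.
if (requireNamespace("traj", quietly = TRUE) &&
    "trajMeasures" %in% getNamespaceExports("traj")) {
  m <- traj::trajMeasures(Data, Time = Time, ID = TRUE, measures = 1:5)
  print(m)
}
```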

Details

Each trajectory must have at least 3 observations; trajectories with fewer are omitted from the analysis. The 20 measures and their numerical identifiers are listed below. Please refer to the vignette for the specific formulas used to compute them.

1. Maximum
2. Minimum
3. Range
4. Mean value
5. Standard deviation
6. Slope of the affine approximation
7. Intercept of the affine approximation
8. Proportion of variance explained by the affine approximation
9. Rate of intersection with the best affine approximation
10. Net variation per unit of time
11. Late variation to early variation contrast
12. Total variation per unit time
13. Spikiness
14. Maximum of the first derivative
15. Minimum of the first derivative
16. Standard deviation of the first derivative
17. First derivative's net variation per unit of time
18. Maximum of the second derivative
19. Minimum of the second derivative
20. Standard deviation of the second derivative
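
To give a feel for a few of the simpler measures, the sketch below computes the maximum, minimum, range, and the slope, intercept and proportion of variance explained by an ordinary least-squares line for one made-up trajectory. It uses base R only and assumes that the "affine approximation" is an ordinary least-squares fit; the package's exact formulas are given in the vignette and may differ.

```r
# One toy trajectory: observation times and values (made-up numbers).
t <- c(0, 1, 2, 3, 4)
y <- c(2, 3, 5, 4, 6)

# Assumption: the affine approximation is the least-squares line y ~ a + b * t.
fit <- lm(y ~ t)

sketch <- c(
  maximum   = max(y),
  minimum   = min(y),
  range     = max(y) - min(y),
  slope     = unname(coef(fit)["t"]),
  intercept = unname(coef(fit)["(Intercept)"]),
  r.squared = summary(fit)$r.squared
)

print(round(sketch, 3))
```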

If cap.outliers is set to TRUE, Nishiyama's improved Chebyshev bound for continuous distributions is used to determine, for each measure, the extreme values corresponding to a 0.3% probability threshold. Values beyond that threshold are then capped at the threshold (see the vignette for more details).
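
The idea of capping at a probability threshold can be sketched in base R. Note that this sketch substitutes the classical Chebyshev inequality for Nishiyama's improved bound, so its thresholds are looser than the ones the package would use. The helper name cap_chebyshev and the use of the sample mean and standard deviation are assumptions made for illustration only.

```r
# Hypothetical helper: cap values lying beyond a classical Chebyshev threshold.
# Classical Chebyshev: P(|X - mu| >= k * sigma) <= 1 / k^2, so a probability
# of p corresponds to k = 1 / sqrt(p). This is NOT Nishiyama's improved bound.
cap_chebyshev <- function(x, p = 0.003) {
  k  <- 1 / sqrt(p)
  lo <- mean(x) - k * sd(x)
  hi <- mean(x) + k * sd(x)
  pmin(pmax(x, lo), hi)  # values outside [lo, hi] are pulled back to the bounds
}

# Made-up values of one measure: 999 ordinary values and one extreme value.
x <- c(rep(0, 999), 1000)
capped <- cap_chebyshev(x)

# The extreme value is reduced; the ordinary values are left unchanged.
print(range(x))
print(range(capped))
```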

References

Leffondre K, Abrahamowicz M, Regeasse A, Hawker GA, Badley EM, McCusker J, Belzile E. Statistical measures were proposed for identifying longitudinal patterns of change in quantitative health indicators. J Clin Epidemiol. 2004 Oct;57(10):1049-62. doi: 10.1016/j.jclinepi.2004.02.012. PMID: 15528056.

Nishiyama T. Improved Chebyshev inequality: new probability bounds with known supremum of PDF. arXiv:1808.10770v2 [stat.ME]. doi: 10.48550/arXiv.1808.10770.

Examples

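if (FALSE) {
data("trajdata")
# Remove the Group column so that only the ID and the trajectories remain.
trajdata.noGrp <- trajdata[, -which(colnames(trajdata) == "Group")]

# The midpoint argument is only used by measure 11, so compute measure 11
# with the default midpoint and with column 3 of Time as the midpoint.
m1 <- trajMeasures(trajdata.noGrp, ID = TRUE, measures = 11, midpoint = NULL)
m2 <- trajMeasures(trajdata.noGrp, ID = TRUE, measures = 11, midpoint = 3)

# Check whether the two midpoint choices give the same values.
identical(m1$measures, m2$measures)
}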
