sld: Simple Longitudinal Difference (SLD)

Description

This function detects influential subjects in a longitudinal dataset by analyzing their successive differences. It calculates the successive differences for each subject, determines a threshold using the mean and standard deviation, and identifies subjects whose maximum successive difference exceeds this threshold. This approach helps in detecting abrupt changes in subject responses over time.

Usage

sld(data, subject_id, time, response, k = 2, verbose = FALSE)

Value

A list containing:

influential_subjects: A vector of subject IDs identified as influential.
influential_data: A data frame containing data for influential subjects.
non_influential_data: A data frame containing data for non-influential subjects.
successive_difference_plot: A ggplot object visualizing maximum successive differences across subjects.
longitudinal_plot: A ggplot object displaying longitudinal data with influential subjects highlighted.
IS_table: A data frame containing the Influence Score (IS) and the Partial Influence Score (PIS) values for each subject at each time point.

Arguments

data: A data frame containing longitudinal data.
subject_id: A character string giving the name of the column that contains subject IDs.
time: A character string giving the name of the column that contains the time points at which observations were measured.
response: A character string giving the name of the column that contains the response variable.
k: A numeric value for the threshold parameter (default is 2), representing the number of standard deviations used to define the threshold.
verbose: Logical; if TRUE, prints informative messages during execution.

Details

The function follows these steps:

Computes successive differences for each subject.
Calculates the mean and standard deviation of these differences across all subjects.
Defines a threshold as k standard deviations from the mean.
Identifies subjects whose maximum successive difference exceeds this threshold.
Separates data into influential and non-influential subjects.
Visualizes the results using ggplot2.

This method is useful for identifying subjects with sudden changes in their response patterns over time.
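
The steps listed above can be sketched in base R on a small made-up dataset. This is only an illustration, not the package's source code: the toy data frame and all object names are hypothetical, and the sketch assumes that differences are taken in absolute value and that the threshold is the mean plus k standard deviations. The plotting step is omitted.

```r
# Hypothetical toy data: 6 subjects, 4 time points each.
# Subject 6 has an abrupt jump at the third time point.
toy <- data.frame(
  subject_id = rep(1:6, each = 4),
  time       = rep(1:4, times = 6),
  response   = c(10, 11, 12, 13,
                 10, 10, 11, 11,
                  9, 10, 10, 11,
                 10, 11, 11, 12,
                 11, 11, 12, 12,
                 10, 11, 30, 12)
)
k <- 2

# Make sure observations are ordered by time within each subject.
toy <- toy[order(toy$subject_id, toy$time), ]

# Step 1: successive differences per subject
# (absolute values are an assumption of this sketch).
diffs <- tapply(toy$response, toy$subject_id, function(y) abs(diff(y)))

# Step 2: mean and standard deviation over all differences.
all_diffs <- unlist(diffs)
mean_diff <- mean(all_diffs)
sd_diff   <- sd(all_diffs)

# Step 3: threshold, assumed here to be mean + k * sd.
threshold <- mean_diff + k * sd_diff

# Step 4: subjects whose maximum successive difference exceeds the threshold.
max_diff    <- sapply(diffs, max)
influential <- names(max_diff)[max_diff > threshold]

# Step 5: split the data into influential and non-influential subjects.
influential_data     <- toy[toy$subject_id %in% influential, ]
non_influential_data <- toy[!toy$subject_id %in% influential, ]

print(influential)
```

In this toy example only subject 6 is flagged, because its largest jump (19 units) is far above the differences of 0 or 1 seen in the other subjects. A larger k raises the threshold and flags fewer subjects.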

Examples

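# Load the example dataset
data(infsdata)
# Keep only the first five rows so the example runs quickly
infsdata <- infsdata[1:5,]
# Detect influential subjects with a threshold of k = 2 standard deviations
result <- sld(infsdata, "subject_id", "time", "response", k = 2)
# IDs of the subjects flagged as influential
print(result$influential_subjects)
# First rows of the influential and non-influential subsets
head(result$influential_data)
head(result$non_influential_data)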