longitudinalData (version 2.0)

generateArtificialLongData: ~ Function: generateArtificialLongData (or gald) ~

Description

This function builds an artificial longitudinal data set (single variable-trajectory) and turns it into an object of class LongData.

Usage

gald(nbEachClusters=50,time=0:10,varNames="V",
    functionClusters=list(function(t){0},function(t){t},function(t){10-t},function(t){-0.4*t^2+4*t}),
    constantPersonal=function(t){rnorm(1,0,2)},
    functionNoise=function(t){rnorm(1,0,2)},
    decimal=2,percentOfMissing=0)


generateArtificialLongData(nbEachClusters=50,time=0:10,varNames="V",
    functionClusters=list(function(t){0},function(t){t},function(t){10-t},function(t){-0.4*t^2+4*t}),
    constantPersonal=function(t){rnorm(1,0,2)},
    functionNoise=function(t){rnorm(1,0,2)},
    decimal=2,percentOfMissing=0)

Arguments

nbEachClusters
[numeric] or [vector(numeric)]: number of trajectories that each cluster must contain. If a single number is given, it is duplicated for all groups.
time
[vector(numeric)]: times at which measurements are made.
varNames
[character]: name of the variable.
functionClusters
[list(function)]: lists the functions defining the average trajectories of each cluster.
constantPersonal
[function] or [list(function)]: lists the functions defining the personal variation between an individual and the mean trajectory of its cluster. Note that these functions should be constant functions (the personal variation cannot evolve over time).
functionNoise
[function] or [list(function)]: lists the functions generating the noise of each trajectory within its own cluster. If a single function is given, it is duplicated for all groups (see Details).
decimal
[numeric]: number of decimal places to which values are rounded.
percentOfMissing
[numeric]: proportion (between 0 and 1) of missing data generated in each cluster. If a single value is given, it is duplicated for all groups. The missing values are Missing Completely At Random (MCAR).
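As an illustration of what MCAR missingness with a per-cluster rate could look like, the sketch below blanks out a fixed proportion of cells in each cluster of a plain matrix. The helper `addMissingMCAR` is hypothetical and is not part of longitudinalData; how the package itself selects the cells (for example, whether it protects some values of each trajectory) is not described on this page, so treat this only as a sketch of the idea.

```r
# Hypothetical helper (not from longitudinalData): set a proportion of the
# cells to NA, chosen uniformly at random within each cluster (MCAR).
addMissingMCAR <- function(traj, clusters, percentOfMissing) {
  groups <- sort(unique(clusters))
  # A single value is duplicated for all groups.
  percentOfMissing <- rep(percentOfMissing, length.out = length(groups))
  for (k in seq_along(groups)) {
    rowsK <- which(clusters == groups[k])
    nbCells <- length(rowsK) * ncol(traj)
    nbMissing <- floor(nbCells * percentOfMissing[k])
    block <- traj[rowsK, , drop = FALSE]
    # Pick cells of this cluster's block without regard to their values.
    block[sample.int(nbCells, nbMissing)] <- NA
    traj[rowsK, ] <- block
  }
  traj
}

set.seed(42)
traj <- matrix(rnorm(40), nrow = 8)    # 8 trajectories, 5 time points
clusters <- rep(1:2, each = 4)         # two clusters of 4 trajectories
withNA <- addMissingMCAR(traj, clusters, c(0.25, 0.5))
tapply(rowSums(is.na(withNA)), clusters, sum)   # missing cells per cluster
```

Because the cells are drawn without looking at the values, the missingness does not depend on the data, which is what MCAR means.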

Value

  • An object of class LongData.

Author

Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France

Details

generateArtificialLongData (gald for short) is a function that constructs a set of artificial longitudinal data. Each individual is considered as belonging to a group. Each group follows a theoretical trajectory, which is a function of time. These functions (one per group) are given via the argument functionClusters. Even though it belongs to a cluster, an individual does not perfectly follow the mean trajectory, so a personal variation is added via the argument constantPersonal. This personal variation is constant over time. Some residual noise is then added to all the trajectories via the argument functionNoise. The number of individuals in each group is given by nbEachClusters. Finally, missing values can be scattered randomly (MCAR) through the data using percentOfMissing.
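The generative process described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `simulateTrajectories` is hypothetical, it returns a plain list rather than a LongData object, and the order of operations (cluster mean, plus one personal constant per individual, plus independent noise per measurement, then rounding) is inferred from the description.

```r
# Hypothetical helper (not part of longitudinalData): simulate trajectories as
# cluster mean + personal constant + per-measurement noise.
simulateTrajectories <- function(nbEachClusters = c(3, 3),
                                 time = 0:10,
                                 functionClusters = list(function(t) 0, function(t) t),
                                 constantPersonal = function(t) rnorm(1, 0, 2),
                                 functionNoise = function(t) rnorm(1, 0, 2),
                                 decimal = 2) {
  nbClusters <- length(functionClusters)
  # A single value is duplicated for all groups.
  nbEachClusters <- rep(nbEachClusters, length.out = nbClusters)
  rows <- list()
  for (k in seq_len(nbClusters)) {
    # Theoretical mean trajectory of cluster k at each time point.
    meanTraj <- vapply(time, function(t) as.numeric(functionClusters[[k]](t)), numeric(1))
    for (i in seq_len(nbEachClusters[k])) {
      personal <- constantPersonal(0)  # drawn once per individual: constant over time
      noise <- vapply(time, function(t) functionNoise(t), numeric(1))
      rows[[length(rows) + 1]] <- round(meanTraj + personal + noise, decimal)
    }
  }
  traj <- do.call(rbind, rows)
  colnames(traj) <- paste0("t", time)
  list(traj = traj, clusters = rep(seq_len(nbClusters), times = nbEachClusters))
}

set.seed(1)
sim <- simulateTrajectories(nbEachClusters = 3)
dim(sim$traj)   # one row per trajectory, one column per time point

# With no personal variation and no noise, each trajectory equals its cluster mean.
sim0 <- simulateTrajectories(nbEachClusters = 2,
                             constantPersonal = function(t) 0,
                             functionNoise = function(t) 0)
```

Drawing the personal term once per individual and the noise once per measurement is what separates the two sources of variation: the first shifts the whole trajectory, the second perturbs each point independently.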

References

[1] C. Genolini and B. Falissard, "KmL: k-means for longitudinal data", Computational Statistics, vol. 25(2), pp. 317-328, 2010
[2] C. Genolini and B. Falissard, "KmL: A package to cluster longitudinal data", Computer Methods and Programs in Biomedicine, vol. 104, pp. e112-121, 2011

See Also

LongData, longData, generateArtificialLongData3d

Examples

par(ask=TRUE)


#####################
### Default example

(ex1 <- generateArtificialLongData())
plot(ex1)
part1 <- partition(rep(1:4,each=50))
plot(ex1,part1)


#####################
### Three diverging lines

ex2 <- generateArtificialLongData(functionClusters=list(function(t)0,function(t)-t,function(t)t))
part2 <- partition(rep(1:3,each=50))
plot(ex2,part2)


#####################
### Three diverging lines with high variance, unbalanced groups and missing values

ex3 <- generateArtificialLongData(
   functionClusters=list(function(t)0,function(t)-t,function(t)t),
   nbEachClusters=c(100,30,10),
   functionNoise=function(t){rnorm(1,0,3)},
   percentOfMissing=c(0.25,0.5,0.25)
)
part3 <- partition(rep(1:3,c(100,30,10)))
plot(ex3,part3)


#####################
### Four strange functions

ex4 <- generateArtificialLongData(
    nbEachClusters=c(300,200,100,100),
    functionClusters=list(function(t){-10+2*t},function(t){-0.6*t^2+6*t-7.5},function(t){10*sin(t)},function(t){30*dnorm(t,2,1.5)}),
    functionNoise=function(t){rnorm(1,0,3)},
    time=0:10,decimal=2,percentOfMissing=0.3)
part4 <- partition(rep(1:4,c(300,200,100,100)))
plot(ex4,part4)


#####################
### To get only the longData (if you want some artificial longData
###    to use with another algorithm), use the getter ["traj"]

ex5 <- gald(nbEachClusters=3,time=1:3)
ex5["traj"]

par(ask=FALSE)
