GGIR (version 2.6-0)

g.part1: function to load and pre-process acceleration files

Description

Calls functions g.getmeta and g.calibrate, and converts the output to .RData format, which will be the input for g.part2. The function also generates a folder structure to keep track of the various output files. The reason g.part1 and g.part2 are not merged into one generic shell function is that g.part1 takes much longer to run and involves only minor decisions of interest to the movement scientist. Function g.part2, on the other hand, is relatively fast and comes with all the decisions that directly impact the variables of interest to the movement scientist. Therefore, the user may want to run g.part1 overnight or on a computing cluster, while g.part2 can then be the main playground for the movement scientist. Function g.shell.GGIR provides the main shell for operating g.part1 and g.part2.

Usage

g.part1(datadir = c(), outputdir = c(), f0 = 1, f1 = c(),
          studyname = c(), myfun = c(), params_metrics = c(), params_rawdata = c(),
          params_cleaning = c(), params_general = c(), ...)

Arguments

datadir

Directory where the accelerometer files are stored or list of accelerometer filenames and directories

outputdir

Directory where the output needs to be stored. Note that this function will attempt to create folders in this directory and use those folders to organise output

f0

File index to start with (default = 1). Index refers to the filenames sorted in increasing order

f1

File index to finish with (defaults to number of files available)
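
As a rough illustration of how f0 and f1 select files, the sketch below sorts a set of made-up filenames and takes a subset by index. It only mimics the indexing described above and does not call GGIR; the filenames are hypothetical.

```r
# Hypothetical accelerometer filenames, as they might appear in datadir
fnames <- c("P03_left.bin", "P01_left.bin", "P02_left.bin", "P04_left.bin")

# The file index refers to the filenames sorted in increasing order
fnames_sorted <- sort(fnames)

f0 <- 2   # first file index to process
f1 <- 3   # last file index to process
selected <- fnames_sorted[f0:f1]
print(selected)
```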

studyname

If the datadir is a folder then the study will be given the name of the data directory. If datadir is a list of filenames then the studyname will be used as name for the analysis

myfun

External function object to be applied to raw data. See details in applyExtFunction.

params_metrics

See details

params_rawdata

See details

params_cleaning

See details

params_general

See details.

...

If you are working with non-standard csv formatted files, g.part1 also takes any input arguments needed for function read.myacc.csv and argument rmc.noise from get_nw_clip_block_params. First test these arguments with function read.myacc.csv directly. To ensure compatibility with R scripts written for older GGIR versions, the user can also provide parameters listed in the params_ objects as direct arguments.

Value

The function returns no value; it only ensures that the output from other functions is stored in .RData files (one file per accelerometer file) in the folder structure

Details

GGIR comes with many processing parameters, which have been thematically grouped in parameter objects (R lists). By running print(load_params()) you can see the default values of all the parameter objects. When g.part1 is used via g.shell.GGIR you have the option to specify a configuration file, which will overrule the default parameter values. Further, as a user you can set parameter values as input arguments to both g.part1 and g.shell.GGIR. Directly specified arguments overrule the configuration file and default values.
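
The order of precedence described above (defaults, then configuration file, then directly specified arguments) can be sketched in base R. This is only an illustration of the overruling logic, not GGIR's own code; the three lists and their values are hypothetical stand-ins.

```r
# Hypothetical parameter values, for illustration only
defaults    <- list(do.enmo = TRUE, do.mad = FALSE, chunksize = 1)
config_file <- list(do.mad = TRUE, chunksize = 0.5)  # as if read from a configuration file
direct_args <- list(chunksize = 0.25)                # as if passed directly by the user

# Later sources overrule earlier ones:
# defaults < configuration file < directly specified arguments
params <- utils::modifyList(defaults, config_file)
params <- utils::modifyList(params, direct_args)
str(params)
```

In this sketch chunksize ends up at the directly specified value, do.mad takes the configuration file value, and do.enmo keeps its default.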

See the GGIR package vignette for a more elaborate overview of parameter objects and their usage across GGIR.

GGIR part 1 (g.part1) takes the following parameter objects as input:

params_metrics

A list of parameters used to specify the signal metrics that need to be extracted in GGIR part 1.

do.anglex

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.angley

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.anglez

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.zcx

Boolean, if TRUE calculate metric zero-crossing count for x-axis. For computation specifics see source code of function g.applymetrics

do.zcy

Boolean, if TRUE calculate metric zero-crossing count for y-axis. For computation specifics see source code of function g.applymetrics

do.zcz

Boolean, if TRUE calculate metric zero-crossing count for z-axis. For computation specifics see source code of function g.applymetrics

do.enmo

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.lfenmo

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.en

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.mad

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.enmoa

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.roll_med_acc_x

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.roll_med_acc_y

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.roll_med_acc_z

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.dev_roll_med_acc_x

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.dev_roll_med_acc_y

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.dev_roll_med_acc_z

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.bfen

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.hfen

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.hfenplus

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.lfen

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.lfx

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.lfy

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.lfz

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.hfx

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.hfy

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.hfz

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.bfx

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.bfy

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.bfz

Boolean, if TRUE calculate metric. For computation specifics see source code of function g.applymetrics

do.brondcounts

Boolean, if TRUE calculate metric via R package activityCounts. We call them BrondCounts because there is a large number of activity counts in the physical activity and sleep research field. By calling them Brond Counts we clarify that these are the counts proposed by Jan Brond and implemented in R by Ruben Brondeel. The Brond Counts are intended to be an imitation of one of the counts produced by the closed-source ActiLife software by ActiGraph.

lb

Numeric, lower boundary of the frequency filter (in Hertz) as used in the filter-based metrics.

hb

Numeric, higher boundary of the frequency filter (in Hertz) as used in the filter-based metrics.

n

Numeric, order of the frequency filter as used in a variety of metrics.

params_rawdata

A list of parameters related to reading and pre-processing raw data, excluding parameters related to metrics as those are in the params_metrics object.

backup.cal.coef

Character. Default value is "retrieve". Option to use backed-up calibration coefficients instead of deriving the calibration coefficients when analysing the same file twice. Argument backup.cal.coef has two use cases. Use case 1: If the auto-calibration fails then the user has the option to provide back-up calibration coefficients via this argument. The value of the argument needs to be the name and directory of a csv-spreadsheet with the following column names and subsequent values: 'filename' with the names of accelerometer files on which the calibration coefficients need to be applied in case auto-calibration fails; 'scale.x', 'scale.y', and 'scale.z' with the scaling coefficients; 'offset.x', 'offset.y', and 'offset.z' with the offset coefficients; and 'temperature.offset.x', 'temperature.offset.y', and 'temperature.offset.z' with the temperature offset coefficients. This can be useful for analysing short-lasting laboratory experiments with insufficient sphere data to perform the auto-calibration, but for which calibration coefficients can be derived in an alternative way. It is the user's responsibility to compile the csv-spreadsheet. Instead of building this file the user can also Use case 2: The user wants to avoid performing the auto-calibration repeatedly on the same file. If backup.cal.coef is set to "retrieve" (default) then GGIR will look for the data_quality_report.csv file in the output folder QC, which holds the previously generated calibration coefficients. If you do not want this to happen, then delete data_quality_report.csv from the QC folder or set backup.cal.coef to "redo".
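
For use case 1, a csv-spreadsheet with the column names listed above could be compiled along the following lines. This is a minimal sketch: the file path, filenames, and coefficient values are all made up for illustration, and only the column names are taken from the description above.

```r
# Hypothetical output path and coefficient values, for illustration only
calfile <- file.path(tempdir(), "backup_cal_coef.csv")

cal <- data.frame(
  filename = c("P01_left.bin", "P02_left.bin"),
  scale.x  = c(1.002, 0.998),
  scale.y  = c(0.999, 1.001),
  scale.z  = c(1.000, 1.003),
  offset.x = c(0.004, -0.002),
  offset.y = c(-0.001, 0.003),
  offset.z = c(0.010, 0.006),
  temperature.offset.x = c(0, 0),
  temperature.offset.y = c(0, 0),
  temperature.offset.z = c(0, 0)
)

# Write without row names so the file holds exactly the ten expected columns
write.csv(cal, file = calfile, row.names = FALSE)

# Read the file back to check the column names
cal_check <- read.csv(calfile)
print(names(cal_check))
```

The path stored in calfile is then what would be supplied as the value of backup.cal.coef.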

minimumFileSizeMB

Numeric. Minimum file size in MB required to enter processing, default 2 MB. This argument can help to prevent short, uninformative files from entering the analyses. Given that a typical accelerometer collects several MBs per hour, the default setting should only skip very tiny files.

do.cal

Boolean. Whether to apply auto-calibration with g.calibrate or not. Default and recommended setting is TRUE.

imputeTimegaps

Boolean to indicate whether time gaps larger than 1 sample should be imputed. Currently only used for .gt3x data and ActiGraph .csv format, where time gaps can be expected as a result of ActiGraph's idle sleep mode configuration that is turned on in some studies.

spherecrit

The minimum required acceleration value (in g) on both sides of 0 g for each axis. Used to judge whether the sphere is sufficiently populated

minloadcrit

The minimum number of hours the code needs to read for the autocalibration procedure to be effective (only sensitive to multiples of 12 hrs, other values will be rounded up). After loading these hours, extra data is only loaded if the calibration error has not been reduced to under 0.01 g.
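
The rounding described above can be pictured with a small calculation. The helper below is hypothetical and simply rounds a number of hours up to the next multiple of 12; it is an assumption about the arithmetic, not a copy of GGIR's internal code.

```r
# Hypothetical helper: round a number of hours up to the next multiple of 12
round_up_to_12 <- function(hours) ceiling(hours / 12) * 12

requested <- c(12, 30, 72, 80)        # example minloadcrit values in hours
effective <- round_up_to_12(requested)
print(data.frame(requested, effective))
```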

printsummary

Boolean. If TRUE will print a summary when done

chunksize

Numeric. Value between 0.2 and 1 to specify the size of chunks to be loaded as a fraction of a 12 hour period, e.g. 0.5 equals 6 hour chunks. The default is 1 (12 hrs). For machines with less than 4 GB of RAM a value below 1 is recommended.
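
To make the fraction concrete, the following lines convert a few chunksize values into hours of data per chunk. This is plain arithmetic based on the 12 hour period mentioned above and does not call GGIR.

```r
chunksize   <- c(0.2, 0.5, 1)    # fractions of a 12 hour period
chunk_hours <- chunksize * 12    # hours of data loaded per chunk
print(data.frame(chunksize, chunk_hours))
```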

dynrange

Numeric, provide dynamic range for accelerometer data to overwrite hardcoded 6 g for GENEA and 8 g for other brands

interpolationType

Integer to indicate type of interpolation to be used when resampling time series (mainly relevant for Axivity sensors), 1=linear, 2=nearest neighbour

all arguments that start with "rmc.".

see function read.myacc.csv and get_nw_clip_block_params

params_cleaning

A list of parameters used across all GGIR parts related to masking or imputing data, abbreviated as 'cleaning'.

do.imp

Boolean. Whether to impute missing values (e.g. suspected of monitor non-wear) or not by g.impute in GGIR part2. Default and recommended setting is TRUE

TimeSegments2ZeroFile

Character. Path to csv-file holding the data.frame used for argument TimeSegments2Zero in function g.impute

data_cleaning_file

Character. Optional path to a csv file you create that holds four columns: ID, day_part5, relyonguider_part4, and night_part4. ID should hold the participant ID. Columns day_part5 and night_part4 allow you to specify which day(s) and night(s) need to be excluded from part 5 and 4, respectively. So, this will be done regardless of whether the rest of GGIR thinks those day(s)/night(s) are valid. Column relyonguider_part4 allows you to specify for which nights part 4 should fully rely on the guider. See also package vignette.

excludefirstlast.part5

Boolean. If TRUE then the first and last window (waking-waking or midnight-midnight) are ignored in part 5.

excludefirstlast

Boolean. If TRUE then the first and last night of the measurement are ignored for the sleep assessment (part 4).

excludefirst.part4

Boolean. If TRUE then the first night of the measurement is ignored for the sleep assessment (part 4).

excludelast.part4

Boolean. If TRUE then the last night of the measurement is ignored for the sleep assessment (part 4).

includenightcrit

Numeric. Minimum number of valid hours per night (24 hour window between noon and noon), used for sleep assessment (part 4).

minimum_MM_length.part5

Numeric. Minimum length in hours of a MM day to be included in the cleaned part 5 results.

selectdaysfile

Character. Functionality designed for the London Centre of Longitudinal studies. Csv file holding the relation between device serial numbers and measurement days of interest.

strategy

Numeric, how to deal with knowledge about study protocol. Value = 1 means select data based on hrs.del.start and hrs.del.end. Value = 2 means that only the data between the first midnight and the last midnight is used for imputation. Value = 3 only selects the most active X days in the file, where X is specified by argument ndayswindow. Value = 4 only uses the data after the first midnight. Used in GGIR part 2

hrs.del.start

Numeric, how many HOURS after start of experiment did wearing of monitor start? Used in GGIR part 2

hrs.del.end

Numeric, how many HOURS before the end of the experiment did wearing of monitor definitely end? Used in GGIR part 2

maxdur

Numeric, how many DAYS after start of experiment did experiment definitely stop? (set to zero if unknown = default). Used in GGIR part 2

ndayswindow

Numeric, if strategy is set to 3 then this is the size of the window as a number of days. Used in GGIR part 2

includedaycrit.part5

Numeric. see g.report.part5

includedaycrit

Numeric, minimum required number of valid hours in day specific analysis (NOTE: there is no minimum required number of hours per day in the summary of an entire measurement, every available hour is used to make the best possible inference on average metric value per average day)

max_calendar_days

Numeric, the maximum number of calendar days to include

params_general

A list of parameters used across all GGIR parts that do not fall in any of the other categories.

overwrite

Boolean. Do you want to overwrite analysis for which milestone data exists? If overwrite=FALSE then milestone data from a previous analysis will be used if available and visual reports will not be created again.

selectdaysfile

Character. Do not use, this is legacy code for one specific data study. Path to a csv file holding the relationship between device serial numbers (first column) and measurement dates of interest (second and third column). The date format should be dd/mm/yyyy. The first row of the csv file is assumed to hold character variable names, e.g. "serialnumber", "Day1" and "Day2", respectively. Raw data will be extracted and stored in the output directory in a new subfolder named 'raw'.

dayborder

Numeric. Hour at which days start and end (default = 0), value = 4 would mean 4 am

do.parallel

Boolean. Whether to use multi-core processing (only works if at least 4 CPU cores are available).

maxNcores

Numeric. Maximum number of cores to use when argument do.parallel is set to TRUE. GGIR by default uses the maximum number of available cores, but this argument allows you to set a lower maximum.

acc.metric

Character. Which one of the metrics do you want to consider to analyze L5. The metric of interest needs to be calculated in M.

part5_agg2_60seconds

Boolean. Whether to aggregate epochs to 60 seconds as part of the part 5 analysis.

print.filename

Boolean. Whether to print the filename before analysing it (default is FALSE). Printing the filename can be useful to investigate problems (e.g. to verify which file is being read).

desiredtz

Character, desired timezone: see also http://en.wikipedia.org/wiki/Zone.tab

configtz

Character. Only functional for AX3 cwa data at the moment. Timezone in which the accelerometer was configured. Only use this argument if the timezone of configuration and the timezone in which recording took place are different.

sensor.location

Character, see g.sib.det

acc.metric

Character, see g.sib.det

windowsizes

Numeric vector, three values to indicate the lengths of the windows as in c(window1,window2,window3). window1 is the short epoch length in seconds (default 5); this is the time window over which acceleration and angle metrics are calculated. window2 is the long epoch length in seconds for which non-wear and signal clipping are defined (default 900). window3 is the window length of data used for non-wear detection (default 3600 seconds). So, when window3 is larger than window2 overlapping windows are used, while if window2 equals window3 non-wear periods are assessed by non-overlapping windows. window2 is expected to be a multiple of 60 seconds.
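
The relation between the three windows can be checked with a few lines of arithmetic. The sketch below uses the default values quoted above and assumes, for illustration, that consecutive non-wear windows are shifted by window2 seconds; the variable names are not taken from the GGIR source.

```r
windowsizes <- c(5, 900, 3600)   # window1, window2, window3 in seconds (defaults)
window1 <- windowsizes[1]
window2 <- windowsizes[2]
window3 <- windowsizes[3]

# Number of short epochs that fit in one long epoch
short_per_long <- window2 / window1

# Non-wear windows overlap when window3 is larger than window2
overlapping <- window3 > window2

# Overlap between consecutive non-wear windows, assuming a shift of window2 seconds
overlap_seconds <- window3 - window2

# window2 is expected to be a multiple of 60 seconds
window2_ok <- window2 %% 60 == 0

print(c(short_per_long = short_per_long, overlap_seconds = overlap_seconds))
print(c(overlapping = overlapping, window2_ok = window2_ok))
```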

idloc

Numeric. If idloc = 1 (default) the code assumes that the ID number is stored in the obvious header field. Note that for ActiGraph data the ID is never stored in the file header. For values 2, 5, 6, and 7, GGIR looks at the filename and extracts the character string preceding the first occurrence of a '_', ' ' (space), '.' (dot), and '-', respectively. Values 3 and 4 are skipped: they were used for one study in 2012 and are no longer actively maintained, but the legacy code has not been removed.
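
The filename-based options can be pictured with a small base R helper. The function extract_id and the example filenames below are hypothetical and only mimic the behaviour described above for idloc values 2, 5, 6, and 7; this is not GGIR's own implementation.

```r
# Separator character for each filename-based idloc value
separators <- c("2" = "_", "5" = " ", "6" = ".", "7" = "-")

# Hypothetical helper: return the part of the filename before the first separator
extract_id <- function(filename, idloc) {
  sep <- separators[[as.character(idloc)]]
  # fixed = TRUE treats the separator literally, which matters for "."
  strsplit(filename, sep, fixed = TRUE)[[1]][1]
}

id_underscore <- extract_id("1234_wrist_recording.bin", idloc = 2)
id_space      <- extract_id("1234 wrist.bin", idloc = 5)
id_dot        <- extract_id("1234.left_wrist.bin", idloc = 6)
id_dash       <- extract_id("1234-wrist.bin", idloc = 7)
print(c(id_underscore, id_space, id_dot, id_dash))
```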

References

  • van Hees VT, Gorzelniak L, Dean Leon EC, Eder M, Pias M, et al. (2013) Separating Movement and Gravity Components in an Acceleration Signal and Implications for the Assessment of Human Daily Physical Activity. PLoS ONE 8(4): e61691. doi:10.1371/journal.pone.0061691

  • van Hees VT, Fang Z, Langford J, Assah F, Mohammad A, da Silva IC, Trenell MI, White T, Wareham NJ, Brage S. Auto-calibration of accelerometer data for free-living physical activity assessment using local gravity and temperature: an evaluation on four continents. J Appl Physiol (1985). 2014 Aug 7

  • Aittasalo M, Vaha-Ypya H, Vasankari T, Husu P, Jussila AM, and Sievanen H. Mean amplitude deviation calculated from raw acceleration data: a novel method for classifying the intensity of adolescents physical activity irrespective of accelerometer brand. BMC Sports Science, Medicine and Rehabilitation (2015).

Examples

# NOT RUN {
    datadir = "C:/myfolder/mydata"
    outputdir = "C:/myresults"
    g.part1(datadir = datadir, outputdir = outputdir)
  
# }
