
stepmetrics (version 1.0.3)

readFile: Read and standardize minute-level step data for one participant

Description

Reads one or more files for a single participant and returns a clean, minute-level data frame with two columns: `timestamp` and `steps`. The function auto-detects common file formats and timestamp layouts, fixes ActiGraph CSV headers/metadata when present, and aggregates to a 60-second epoch if input data are recorded at sub-minute resolution.

**Supported input formats**

  • CSV: Generic CSVs and ActiGraph exports (header lines and delimiters auto-detected; handles date/time split columns).

  • AGD: ActiGraph binary files via PhysicalActivity.

  • RData: GGIR output (IMP$metashort).

Usage

readFile(path, time_format = c(), tz = "")

Value

A data.frame with two columns:

timestamp

Character vector of ISO-8601 datetimes ("YYYY-MM-DDTHH:MM:SS%z") for CSV/AGD inputs. For GGIR RData inputs, timestamps are carried through as present in IMP$metashort (which may not include an offset).

steps

Numeric vector of steps per minute. If the source data have sub-minute epochs, values are summed to 60-second bins. Epochs longer than 60 seconds are not supported and trigger an error.

Arguments

path

Character vector. Path(s) to the file(s) containing timestamp and step data for one participant. When multiple files are provided, they are concatenated in the order given.

time_format

Character (optional). Explicit timestamp format string (as used by strptime) to override auto-detection for CSV inputs. If omitted, a set of common formats is tried automatically. The time zone is controlled by tz.

tz

Character (optional). Time zone in which to interpret and emit timestamps for CSV/AGD inputs (e.g., "Europe/Madrid"). The default "" uses the current R session time zone (Sys.timezone() / Sys.getenv("TZ")). The local clock time in the data is preserved; the returned `timestamp` strings include an explicit ISO-8601 offset (%z). Ignored for GGIR RData inputs (timestamps are carried through as stored).
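As an illustration of why an explicit `time_format` can matter, the base-R sketch below parses one ambiguous timestamp with two different strptime formats. The raw string is hypothetical and not taken from the package's test files; the sketch only shows how the format string changes the interpretation, not how readFile performs its auto-detection.

```r
# Hypothetical raw timestamp: valid under both day-first and month-first layouts.
raw <- "01/02/2024 08:30:00"

# Same string, two strptime formats, two different dates.
day_first   <- as.POSIXct(raw, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
month_first <- as.POSIXct(raw, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

format(day_first, "%Y-%m-%d")    # "2024-02-01"
format(month_first, "%Y-%m-%d")  # "2024-01-02"
```

When a file uses a layout like this, passing the intended string (for example `time_format = "%d/%m/%Y %H:%M:%S"`) removes the ambiguity.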

Time zones

CSV / AGD inputs:

Timestamps are parsed in tz (default: session time zone) and emitted as ISO-8601 with an explicit offset. This preserves the local clock time. With the default tz, running the same code on machines with different session time zones may change the offset but not the clock time; pass a fixed tz to get identical output everywhere.

GGIR RData inputs:

Timestamps are returned as stored in IMP$metashort; no conversion is performed.
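The CSV/AGD behaviour described above can be approximated in base R. The sketch below is not the package's internal code: `to_iso()` is a hypothetical helper and the raw timestamp is made up. It parses the same clock time in two time zones and shows that only the offset differs.

```r
# Hypothetical local clock time as it might appear in a CSV file.
raw <- "2024-01-15 08:30:00"

# Hypothetical helper: parse in `tz`, emit ISO-8601 with an explicit offset.
to_iso <- function(x, tz) {
  parsed <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = tz)
  format(parsed, "%Y-%m-%dT%H:%M:%S%z")
}

madrid  <- to_iso(raw, "Europe/Madrid")     # "2024-01-15T08:30:00+0100"
newyork <- to_iso(raw, "America/New_York")  # "2024-01-15T08:30:00-0500"

# The clock time (first 19 characters) is identical; only the offset changes.
substr(madrid, 1, 19) == substr(newyork, 1, 19)  # TRUE
```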

Details

  • CSV handling: Detects and skips ActiGraph header lines (typically 10), infers the field separator (comma/semicolon), and reconstructs a single timestamp when date and time are stored in separate columns. If no explicit timestamp column exists (rare ActiGraph cases), a timestamp sequence is reconstructed from the file metadata (start time + epoch).

  • AGD handling: Reads via readActigraph. The recording start time and epoch length are obtained from the embedded database and used to build a regular timestamp sequence, interpreted in tz.

  • Step column detection: The step-count column is inferred by matching names containing "step" or "value"; if multiple candidates are present, the column with the highest variability is chosen.

  • Epoch standardization: If the input epoch is shorter than 60 seconds, rows are aggregated by summing steps to 1-minute bins. Epochs longer than 60 seconds are currently unsupported and result in an error.
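The step-column detection, timestamp reconstruction, and epoch standardization steps above can be sketched in base R as follows. This is an illustrative approximation, not the package's implementation: the data frame, start time, and epoch are hypothetical, and "variability" is assumed here to mean the sample variance.

```r
# Hypothetical 30-second recording; start time and epoch stand in for the
# values that would be read from file metadata.
start <- as.POSIXct("2024-01-15 08:00:00", tz = "UTC")
epoch <- 30  # seconds

raw <- data.frame(
  axis1 = c(10, 12, 0, 5, 7, 9),
  steps = c(20, 25, 0, 10, 30, 15),
  value = c(1, 1, 1, 1, 1, 1)
)

# Candidate step columns: names containing "step" or "value".
candidates <- grep("step|value", names(raw), ignore.case = TRUE, value = TRUE)

# Assumption: variability is measured by the sample variance.
variances <- vapply(raw[candidates], var, numeric(1))
step_col  <- candidates[which.max(variances)]  # "steps"

# Epochs longer than 60 seconds are rejected, as documented.
if (epoch > 60) stop("Epochs longer than 60 seconds are not supported.")

# Regular timestamp sequence from start time + epoch.
timestamp <- start + (seq_len(nrow(raw)) - 1) * epoch

# Floor each timestamp to the start of its minute (seconds since 1970-01-01),
# then sum the steps within each minute.
minute_start <- (as.numeric(timestamp) %/% 60) * 60
sums <- tapply(raw[[step_col]], minute_start, sum)

per_minute <- data.frame(
  timestamp = format(
    as.POSIXct(as.numeric(names(sums)), origin = "1970-01-01", tz = "UTC"),
    "%Y-%m-%dT%H:%M:%S%z"
  ),
  steps = as.numeric(sums)
)

per_minute$steps  # 45 10 45
```

Six 30-second rows collapse to three 1-minute rows, and the constant `value` column is passed over in favour of `steps` because its variance is zero.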

See Also

step.metrics, get_cadence_bands, readActigraph

Examples

# \donttest{
# Fitbit CSV (auto-detect format)
fitbit_csv <- system.file("extdata", "testfiles_fitbit",
                          "S001_d1_1min_epoch.csv", package = "stepmetrics")
df1 <- readFile(fitbit_csv)

# ActiGraph AGD (explicitly pin time zone for reproducibility)
agd <- system.file("extdata", "testfiles_agd", "3h30sec.agd", package = "stepmetrics")
df2 <- readFile(agd, tz = "Europe/Madrid")
# }
