read_dir: Read and Merge Files from Directory

Description

Reads data files from any given directory as data frames and merges them into a single data frame (using data.table::rbindlist).

Usage

read_dir(
  pattern = "*[.]",
  path = ".",
  reader_function = data.table::fread,
  ...,
  subdirs = FALSE,
  filt = NULL,
  hush = FALSE
)

Arguments

pattern: Regular expression ("regex"; as string or NULL) for selecting files (passed to the list.files function). If NULL, all files at the specified path are read in. To select a specific extension such as ".txt", give the pattern as "\\.txt$" (for CSV files, "\\.csv$", etc.); the backslash is doubled because the pattern is written as an R string. Files ending with e.g. "group2.txt" can be specified as "group2\\.txt$". Files starting with "exp3" can be specified as "^exp3". Files starting with "exp3" AND ending with the ".txt" extension can be specified as "^exp3.*\\.txt$". To read in a single file, specify the full filename (e.g. "exp3_subject46_group2.txt"). (See ?regex for more details.)
path: Path to the directory from which the files should be selected and read. The default "." means the current working directory (as returned by getwd()). Either set the correct working directory in advance (see setwd, path_neat), or enter a relative or full path (e.g. "C:/research" or "/home/projects", etc.).
reader_function: A function to be used for reading the files, data.table::fread by default.
...: Any arguments to be passed on to the chosen reader_function.
subdirs: Logical (FALSE by default). If TRUE, searches files in subdirectories as well (relative to the given path).
filt: An expression to filter, by column values, each data file after it is read and before it is merged with the other data. (The expression should use column names alone; see Examples.)
hush: Logical. If FALSE (default), prints the names of all data files as they are being read (along with related warnings).
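
The patterns described for the pattern argument can be checked against a list of file names before any file is read. The sketch below uses only base R (grep); apart from "exp3_subject46_group2.txt", the file names are hypothetical and serve only as illustration.

```r
# Hypothetical file names (only the first one appears in the text above)
files <- c(
    "exp3_subject46_group2.txt",
    "exp3_subject47_group1.csv",
    "exp2_subject01_group2.txt",
    "notes.md"
)

# all files with the ".txt" extension
txt_files <- grep("\\.txt$", files, value = TRUE)

# files ending with "group2.txt"
group2_files <- grep("group2\\.txt$", files, value = TRUE)

# files starting with "exp3"
exp3_files <- grep("^exp3", files, value = TRUE)

# files starting with "exp3" AND ending with ".txt"
exp3_txt_files <- grep("^exp3.*\\.txt$", files, value = TRUE)

print(exp3_txt_files)
```

Checking a pattern this way (or with list.files(pattern = ...) on the actual directory) shows which files would be selected before they are read and merged.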

Examples

# \donttest{

# first, set current working directory
# e.g. to script's path with setwd(path_neat())

# read all text files in current working directory
merged_df = read_dir("\\.txt$")
# merged_df now has all data

# to use utils::read.table for reading (slower than fread)
# (with some advisable options passed to it)
merged_df = read_dir(
    '\\.txt$',
    reader_function = read.table,
    header = TRUE,
    fill = TRUE,
    quote = "\"",
    stringsAsFactors = FALSE
)
# }
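
To make the read, filter, and merge sequence from the Description concrete, here is a minimal base-R sketch of the same flow. It is not the read_dir implementation: it uses read.table and rbind instead of data.table, and the directory, file names, columns (subject, rt), and the filter condition are all made up for illustration.

```r
# Assumption: a throwaway directory with two small, made-up data files
demo_dir <- file.path(tempdir(), "read_dir_demo")
dir.create(demo_dir, showWarnings = FALSE)

write.table(
    data.frame(subject = 1, rt = c(320, 95, 410)),
    file.path(demo_dir, "exp3_subject1.txt"),
    row.names = FALSE, quote = FALSE, sep = "\t"
)
write.table(
    data.frame(subject = 2, rt = c(120, 505)),
    file.path(demo_dir, "exp3_subject2.txt"),
    row.names = FALSE, quote = FALSE, sep = "\t"
)

# select files by regex, as list.files does with a pattern
paths <- list.files(demo_dir, pattern = "^exp3.*\\.txt$", full.names = TRUE)

# read each file, filter it by column values, then stack the results
pieces <- lapply(paths, function(p) {
    dat <- read.table(p, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
    dat[dat$rt > 100, , drop = FALSE]  # hypothetical filter: keep rt above 100
})
merged_demo <- do.call(rbind, pieces)

print(merged_demo)
```

Each file is filtered before merging, which matches the order described for the filt argument: rows are dropped per file, and only the remaining rows end up in the merged data frame.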
