daiR (version 1.0.0)

merge_shards: Merge shards

Description

Merges text files from Document AI output shards into a single text file corresponding to the parent document.

Usage

merge_shards(source_dir = getwd(), dest_dir = getwd())

Value

No return value; called for its side effect of writing the merged .txt files to dest_dir.

Arguments

source_dir

folder path for input files

dest_dir

folder path for output files

Details

The function works on .txt files generated from .json output files, not on .json files directly. It also presupposes that the .txt filenames have the same name stems as the .json files from which they were extracted. For the v1 API, this means files ending with "-0.txt", "-1.txt", "-2.txt", and so forth. The safest approach is to generate .txt files using get_text() with the save_to_file parameter set to TRUE.
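The naming convention described above can be illustrated with a small base R sketch. This is not daiR's implementation, only an approximation of the idea: it groups files by the name stem that precedes the "-0.txt", "-1.txt" suffix and concatenates each group in numeric shard order. The folder names, the sample "report" files, and the choice to name the merged file "<stem>.txt" are all assumptions made for illustration.

```r
# Illustrative sketch only; NOT the code used inside merge_shards().
# Folder names, sample files, and the output name are hypothetical.
src  <- file.path(tempdir(), "shards_in")
dest <- file.path(tempdir(), "shards_out")
dir.create(src,  showWarnings = FALSE)
dir.create(dest, showWarnings = FALSE)

# Two made-up shards that follow the "-0.txt", "-1.txt" pattern
writeLines("first part",  file.path(src, "report-0.txt"))
writeLines("second part", file.path(src, "report-1.txt"))

# Find shard files, then split each name into stem and shard index
files <- list.files(src, pattern = "-[0-9]+\\.txt$", full.names = TRUE)
stems <- sub("-[0-9]+\\.txt$", "", basename(files))
index <- as.integer(sub("^.*-([0-9]+)\\.txt$", "\\1", basename(files)))

for (stem in unique(stems)) {
  in_group <- stems == stem
  # Sort numerically so that shard 10 comes after shard 2
  shard_files <- files[in_group][order(index[in_group])]
  text <- unlist(lapply(shard_files, readLines))
  # Assumed output name: the shared stem plus ".txt"
  writeLines(text, file.path(dest, paste0(stem, ".txt")))
}

merged <- readLines(file.path(dest, "report.txt"))
```

Sorting on the integer index rather than on the filename avoids the alphabetical ordering problem where "-10.txt" would sort before "-2.txt".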

Examples

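if (FALSE) {
  # Merge shards found in the working directory and write the result there
  merge_shards()

  # Read shards from the session's temporary directory and
  # write the merged files to the working directory
  merge_shards(tempdir(), getwd())
}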
