textTinyR (version 1.0.9)

text_file_parser: text file parser

Description

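Extracts subsets of a structured text file, delimited by the start_query and end_query, from the input file to the output file.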

Usage

text_file_parser(input_path_file = NULL, output_path_file = NULL,
  start_query = NULL, end_query = NULL, min_lines = 1,
  trimmed_line = FALSE, verbose = FALSE)

Arguments

input_path_file

a character string specifying the path to the input file

output_path_file

a character string specifying the path to the output file

start_query

a character string. The start_query is the first word of each subset of the data and should appear frequently in the text file, at the beginning of a line.

end_query

a character string. The end_query is the last word of each subset of the data and should appear frequently in the text file, at the end of a line.

min_lines

a numeric value specifying the minimum number of lines. For instance, if min_lines = 2, then only subsets of text with at least 2 lines will be kept.

trimmed_line

either TRUE or FALSE. If FALSE, then each line of the text file will be trimmed on both sides before the start_query and end_query are applied.

verbose

either TRUE or FALSE. If TRUE, then information will be printed to the console.

Details

The text file should have a structure (such as an xml-structure), so that subsets can be extracted using the start_query and end_query parameters.
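
As an illustration of the kind of structure meant here, the following is a minimal sketch that writes a small XML-like file to a temporary location and passes it to text_file_parser with the signature shown under Usage. The file contents, the `<doc>` and `</doc>` queries, and the temporary paths are assumptions made for this example only; the exact content of the output file is not documented on this page, so the sketch simply prints whatever was written.

```r
# Hypothetical input: a small XML-like file in a temporary location.
input_path  <- tempfile(fileext = ".txt")
output_path <- tempfile(fileext = ".txt")

writeLines(c(
  "<doc>",
  "first record, line one",
  "</doc>",
  "<doc>",
  "second record, line one",
  "</doc>"
), input_path)

# Call the parser only if textTinyR is installed.
if (requireNamespace("textTinyR", quietly = TRUE)) {
  textTinyR::text_file_parser(
    input_path_file  = input_path,
    output_path_file = output_path,
    start_query      = "<doc>",   # assumed start marker of each subset
    end_query        = "</doc>",  # assumed end marker of each subset
    min_lines        = 1,
    trimmed_line     = FALSE,
    verbose          = FALSE
  )

  # Inspect whatever the parser wrote, if anything.
  if (file.exists(output_path)) print(readLines(output_path))
}
```

Setting verbose = TRUE in the call above should print progress information to the console, as described under Arguments.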

Examples

# NOT RUN {
library(textTinyR)

# fp = text_file_parser(input_path_file = '/folder/input_data.txt',
#                       output_path_file = '/folder/output_data.txt',
#                       start_query = 'word_a', end_query = 'word_w',
#                       min_lines = 1, trimmed_line = FALSE)
# }
