contentanalysis (version 0.2.1)

merge_text_chunks_named: Merge Text Chunks into Named Sections

Description

Takes a list of markdown text chunks and merges them into named sections. Each section name is extracted from the markdown header (# Title).

Usage

merge_text_chunks_named(
  text_chunks,
  remove_tables = TRUE,
  remove_figure_captions = TRUE
)

Value

A named character vector where:

  • Names are section titles (without the # symbol)

  • Values are complete section contents (including the title line)
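
The return shape described above can be illustrated with a minimal base R sketch. The helper `sketch_merge_named()` is hypothetical and is not the package's implementation: it assumes sections begin at level-1 headers (`# Title`), drops any text before the first header, and ignores the table and figure-caption options. The sample chunks are made up for illustration.

```r
# Hypothetical helper (not part of contentanalysis): it only illustrates
# the documented return shape, a named character vector of sections.
sketch_merge_named <- function(text_chunks) {
  # Join the sequential chunks, then split back into individual lines
  joined <- paste(unlist(text_chunks), collapse = "\n")
  lines <- unlist(strsplit(joined, "\n", fixed = TRUE))

  # Assumption: a section starts at a level-1 header such as "# Title"
  is_header <- grepl("^# ", lines)
  if (!any(is_header)) return(character(0))

  # Label each line with its section index (0 = text before the first header)
  section_id <- cumsum(is_header)
  keep <- section_id > 0

  # Rebuild each section as one string, title line included
  sections <- vapply(
    split(lines[keep], section_id[keep]),
    paste, character(1), collapse = "\n"
  )

  # Name each section by its title with the leading "# " removed
  names(sections) <- trimws(sub("^# ", "", lines[is_header]))
  sections
}

# Made-up chunks: the first section continues into the second chunk
chunks <- list(
  "# Introduction\nFirst paragraph.",
  "Second paragraph.\n# Methods\nWe did things."
)
result <- sketch_merge_named(chunks)
names(result)
```

In this sketch the text that spills over from the first chunk stays attached to the section opened in that chunk, which is the point of merging chunks before splitting them into sections.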

Arguments

text_chunks

A list of character strings with markdown text from sequential PDF chunks

remove_tables

Logical. If TRUE, removes all table content, including captions. Default is TRUE.

remove_figure_captions

Logical. If TRUE, removes figure captions. Default is TRUE.