split_into_sections

Splits extracted text into logical sections (Introduction, Methods, Results, etc.)
using either the PDF's table of contents or common academic section patterns.
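
The pattern-based fallback mentioned above can be pictured with a small base R sketch. This is an illustration only, not the package's internal code: the helper name `split_by_headings`, its default heading list, and the rule that a heading must sit alone on its own line are all assumptions made for the example.

```r
# Hypothetical helper: NOT part of contentanalysis, shown only to
# illustrate splitting text on common academic section names.
split_by_headings <- function(text,
                              headings = c("Introduction", "Methods", "Results",
                                           "Discussion", "Conclusion", "References")) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]

  # Assumption: a heading is a line containing only a known section name
  pattern <- paste0("^[[:space:]]*(", paste(headings, collapse = "|"),
                    ")[[:space:]]*$")
  is_heading <- grepl(pattern, lines, ignore.case = TRUE)

  # No recognizable headings: return the whole text as a single element
  if (!any(is_heading)) return(list(Full_text = text))

  # cumsum() labels each line with the index of the latest heading seen;
  # lines before the first heading get 0 and are dropped in this sketch
  group <- cumsum(is_heading)
  keep <- group > 0 & !is_heading
  body <- split(lines[keep],
                factor(group[keep], levels = seq_len(sum(is_heading))))

  sections <- lapply(body, paste, collapse = "\n")
  names(sections) <- trimws(lines[is_heading])
  sections
}

demo_text <- paste(
  "Introduction", "Background on the topic.",
  "Methods", "We sampled documents.", "Two steps were used.",
  "Results", "Accuracy improved.",
  sep = "\n"
)
demo_sections <- split_by_headings(demo_text)
names(demo_sections)
```

A real implementation would likely need to tolerate numbered headings, inline headings, and text that precedes the first heading; the sketch ignores these cases to keep the idea visible.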

Provides comprehensive tools for extracting and analyzing scientific
content from PDF documents, including citation extraction, reference matching,
text analysis, and bibliometric indicators. Supports multi-column PDF layouts,
'CrossRef' API <https://www.crossref.org/documentation/retrieve-metadata/rest-api/> integration, and advanced citation parsing.

Massimo Aria

contentanalysis

Scientific Content and Citation Analysis from PDF Documents

Corrado Cuccurullo

split_into_sections function

<dl><dt>text</dt>
<dd>Character string. Full text of the document.</dd>
<dt>file_path</dt>
<dd>Character string or NULL. Path to PDF file for TOC extraction.
If NULL, uses common section names. Default is NULL.</dd></dl>
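
A minimal usage sketch based on the two arguments documented above. The sample text is invented for illustration, and because this page does not describe the return value, the example only inspects the result with `str()` rather than assuming a particular structure. The call is guarded so the snippet also runs where the package is not installed.

```r
# Invented sample text; in practice this would be text extracted from a PDF
doc_text <- paste(
  "Introduction", "This study examines citation patterns.",
  "Methods", "We analysed a sample of articles.",
  "Results", "Citations cluster in the introduction.",
  sep = "\n\n"
)

if (requireNamespace("contentanalysis", quietly = TRUE)) {
  # file_path = NULL (the default) means no TOC is read from a PDF,
  # so common section names are used instead
  sections <- contentanalysis::split_into_sections(text = doc_text,
                                                   file_path = NULL)
  str(sections, max.level = 1)
}
```

To use the table of contents instead, pass the path to the source PDF as `file_path`.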

Arguments

Split document text into sections — split_into_sections

