RAGFlowChainR (version 0.1.1)

data_fetcher: data_fetcher.R Overview

Description

Provides the `fetch_data()` function, which extracts and structures content from:

  • Local files (PDF, DOCX, PPTX, TXT, HTML)

  • Crawled websites (with optional BFS crawl depth)

The returned data frame includes metadata columns such as `title`, `author`, and `publishedDate`, along with the main extracted `content`.
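To make the idea of a BFS crawl depth concrete, the sketch below walks a small in-memory link graph level by level and stops after a fixed number of link hops. It is only an illustration and not the package's implementation: the `links` graph, the `bfs_pages()` helper, and the convention that depth 0 means "start page only" are all assumptions made for this example.

```r
# Hypothetical link graph standing in for real web pages:
# each name is a page, each value lists the pages it links to.
links <- list(
  home  = c("about", "blog"),
  about = c("team"),
  blog  = c("post1", "home"),
  team  = character(0),
  post1 = c("post2"),
  post2 = character(0)
)

# Breadth-first traversal limited to `max_depth` link hops from `start`.
# (Illustrative helper, not part of RAGFlowChainR.)
bfs_pages <- function(links, start, max_depth) {
  visited  <- start   # pages already collected, in discovery order
  frontier <- start   # pages found at the current depth level
  depth    <- 0
  while (depth < max_depth && length(frontier) > 0) {
    # Gather every page linked from the current level
    candidates <- unique(unlist(links[frontier], use.names = FALSE))
    # Keep only pages that have not been seen yet
    frontier <- setdiff(candidates, visited)
    visited  <- c(visited, frontier)
    depth    <- depth + 1
  }
  visited
}

bfs_pages(links, "home", max_depth = 0)  # "home"
bfs_pages(links, "home", max_depth = 1)  # "home" "about" "blog"
```

Under these assumptions, raising the depth by one adds only the pages reachable through one more link, and pages already visited (such as the link from `blog` back to `home`) are not fetched again.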

Required Packages

`install.packages(c("pdftools", "officer", "rvest", "xml2", "dplyr", "stringi", "curl", "httr", "jsonlite", "magrittr"))`
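If some of these packages are already installed, it can be convenient to install only the missing ones. The following is a minimal sketch using base R; the `missing_packages()` helper is a hypothetical name introduced here, not something the package provides.

```r
# Packages listed as required on this page
required <- c("pdftools", "officer", "rvest", "xml2", "dplyr",
              "stringi", "curl", "httr", "jsonlite", "magrittr")

# Return the subset of `pkgs` that cannot be loaded on this machine.
# (Illustrative helper, not part of RAGFlowChainR.)
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Install only what is missing; guarded so that non-interactive
# scripts do not trigger downloads unexpectedly.
if (interactive()) {
  to_install <- missing_packages(required)
  if (length(to_install) > 0) install.packages(to_install)
}
```

Note that `requireNamespace()` loads a package's namespace as a side effect of checking for it, which is usually harmless but worth knowing in a fresh session.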

Arguments