Provides the `fetch_data()` function, which extracts and structures content from:

- Local files (PDF, DOCX, PPTX, TXT, HTML)
- Crawled websites (with optional BFS crawl depth)

The returned data frame includes metadata columns like `title`, `author`, `publishedDate`, and the main extracted `content`.
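
To make the shape of the result concrete, here is a hand-built data frame with the columns named above. The values are invented placeholders, not output from `fetch_data()`, and the real function may return additional columns or different column types.

```r
# Hand-built stand-in for the result; values are illustrative placeholders,
# not real output from fetch_data().
example_result <- data.frame(
  title         = c("Annual Report", "Project Home"),
  author        = c("A. Author", NA),
  publishedDate = c("2024-01-15", NA),
  content       = c("Text extracted from a local PDF ...",
                    "Text extracted from a crawled page ..."),
  stringsAsFactors = FALSE
)

# Keep only the rows that carry an author.
with_author <- example_result[!is.na(example_result$author), ]

# Character count of the extracted text in each row.
content_chars <- nchar(example_result$content)
```

Because the metadata and the extracted text sit side by side in one row per document, ordinary data frame subsetting (or `dplyr` verbs) is enough to filter by metadata before processing `content`.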
## Required Packages

    install.packages(c("pdftools", "officer", "rvest", "xml2", "dplyr", "stringi", "curl", "httr", "jsonlite", "magrittr"))
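
When some of these packages are already present, it can be convenient to install only the missing ones. The following is a minimal sketch using base R's `requireNamespace()`; the helper names `missing_packages()` and `install_missing()` are hypothetical and not part of this project.

```r
required <- c("pdftools", "officer", "rvest", "xml2", "dplyr",
              "stringi", "curl", "httr", "jsonlite", "magrittr")

# Hypothetical helper: return the packages whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only what is missing and return those names.
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling `install_missing(required)` would then install only the packages that are absent, which avoids reinstalling packages that are already available.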