DBpedia ontology classification dataset. It contains 560,000 training samples and 70,000 testing samples in total, drawn evenly from 14 nonoverlapping classes from DBpedia (40,000 training and 5,000 testing samples per class).
dataset_dbpedia(
dir = NULL,
split = c("train", "test"),
delete = FALSE,
return_path = FALSE,
clean = FALSE,
manual_download = FALSE
)
A tibble with 560,000 or 70,000 rows for "train" and "test" respectively and 3 variables:
Character, denoting the class
Character, title of article
Character, description of article
dir: Character, path to directory where data will be stored. If NULL, user_cache_dir will be used to determine path.
split: Character. Return training ("train") data or testing ("test") data. Defaults to "train".
delete: Logical, set TRUE to delete dataset.
return_path: Logical, set TRUE to return the path of the dataset.
clean: Logical, set TRUE to remove intermediate files. This can greatly reduce the size. Defaults to FALSE.
manual_download: Logical, set TRUE if you have manually downloaded the file and placed it in the folder designated by running this function with return_path = TRUE.
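The manual-download route described above can be sketched as follows. This is only an illustration of the order of steps, using the arguments documented here; it is wrapped in if (FALSE) so that nothing is downloaded or read unless you run the lines yourself, and the variable names target_dir and train are placeholders.

```r
# Sketch of the manual-download workflow. Guarded with if (FALSE) because it
# needs the textdata package and a file you have placed on disk by hand.
if (FALSE) {
  library(textdata)

  # Step 1: ask where the DBpedia files are expected to live.
  target_dir <- dataset_dbpedia(return_path = TRUE)
  print(target_dir)

  # Step 2 (done by hand): put the manually downloaded file in target_dir.

  # Step 3: load the data, telling the function the file is already there.
  train <- dataset_dbpedia(split = "train", manual_download = TRUE)
}
```

If you store the data somewhere other than the default cache, pass the same dir value in both calls so that the reported folder and the folder that is read match.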
The classes are
Company
EducationalInstitution
Artist
Athlete
OfficeHolder
MeanOfTransportation
Building
NaturalPlace
Village
Animal
Plant
Album
Film
WrittenWork
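A quick sanity check after loading is to count rows per class and look for unexpected labels. The sketch below uses a small hand-made data frame as a stand-in for the returned tibble, so it runs without downloading anything. The column names (class, title, description), the assumption that the class column holds the names listed above, and the toy rows are all illustrative and should be checked against the real data.

```r
# The 14 class names listed above.
dbpedia_classes <- c(
  "Company", "EducationalInstitution", "Artist", "Athlete", "OfficeHolder",
  "MeanOfTransportation", "Building", "NaturalPlace", "Village", "Animal",
  "Plant", "Album", "Film", "WrittenWork"
)

# Toy stand-in for the tibble returned by dataset_dbpedia(); the column
# names and the rows are hypothetical.
toy <- data.frame(
  class = c("Company", "Film", "Film", "Animal"),
  title = c("A", "B", "C", "D"),
  description = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Count rows per class, keeping classes with zero rows visible.
class_counts <- table(factor(toy$class, levels = dbpedia_classes))

# Collect any label that is not one of the 14 documented classes.
unknown <- setdiff(unique(toy$class), dbpedia_classes)

print(class_counts)
print(unknown)
```

With the real data, replacing toy with the result of dataset_dbpedia(split = "train") should apply the same two checks, provided the column assumptions hold.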
Other topic: dataset_ag_news(), dataset_trec()
if (FALSE) {
  library(textdata)

  # Default: training data, stored in the user cache directory
  dataset_dbpedia()

  # Custom directory
  dataset_dbpedia(dir = "data/")

  # Deleting dataset
  dataset_dbpedia(delete = TRUE)

  # Returning filepath of data
  dataset_dbpedia(return_path = TRUE)

  # Access both training and testing dataset
  train <- dataset_dbpedia(split = "train")
  test <- dataset_dbpedia(split = "test")
}