rf100_document_collection: RF100 Document Collection Datasets

Description

RoboFlow 100 Document Dataset Collection

Usage

rf100_document_collection(
  dataset,
  split = c("train", "test", "valid"),
  transform = NULL,
  target_transform = NULL,
  download = FALSE
)

Value

A torch dataset. Each element is a named list with:

x: H x W x 3 array representing the image.
y: a list containing the target with:
- image_id: numeric identifier of the x image.
- labels: numeric class identifiers for the N bounding boxes.
- boxes: a torch_tensor of shape (N, 4) with bounding boxes, each in \((x_{min}, y_{min}, x_{max}, y_{max})\) format.

The returned item inherits the class image_with_bounding_box so it can be visualised with helper functions such as draw_bounding_boxes().
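
Because each row of boxes stores corner coordinates, widths, heights, and areas follow by subtraction. The sketch below is a minimal base R illustration, not part of the package: the matrix is a hand-made stand-in for item$y$boxes with made-up values, and a real torch_tensor would first need converting to an R array (for example with as.array()).

```r
# Stand-in for item$y$boxes: N = 2 boxes, one per row.
# Columns follow the documented (xmin, ymin, xmax, ymax) order.
# The coordinates are illustrative, not taken from the dataset.
boxes <- matrix(
  c(10, 20, 110, 70,
     5,  5,  25, 45),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("xmin", "ymin", "xmax", "ymax"))
)

# Width and height are differences between opposite corners.
box_width  <- boxes[, "xmax"] - boxes[, "xmin"]
box_height <- boxes[, "ymax"] - boxes[, "ymin"]
box_area   <- box_width * box_height

print(data.frame(width = box_width, height = box_height, area = box_area))
```

A check such as all(box_width > 0 & box_height > 0) is a cheap way to spot degenerate boxes before training.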

Arguments

dataset: Name of the dataset to load. One of c("tweeter_post", "tweeter_profile", "document_part", "activity_diagram", "signature", "paper_part", "tabular_data", "paragraph").
split: Subset of the dataset to load. One of c("train", "test", "valid").
transform: Optional transform function applied to the image.
target_transform: Optional transform function applied to the target.
download: Logical. If TRUE, downloads the dataset if it is not already present locally.

Details

Loads one of the RoboFlow 100 Document datasets with COCO-style bounding box annotations for object detection tasks.
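
COCO annotation files conventionally store each box as (x, y, width, height), measured from the top-left corner, whereas the items returned here use (x_min, y_min, x_max, y_max). The loader presumably performs this conversion internally. The sketch below only illustrates the arithmetic, and coco_to_xyxy() is a hypothetical helper, not a function exported by the package.

```r
# Hypothetical helper: convert COCO-style (x, y, width, height) boxes
# into (xmin, ymin, xmax, ymax) corner format.
# Accepts a length-4 numeric vector or an N x 4 matrix.
coco_to_xyxy <- function(bbox) {
  bbox <- matrix(bbox, ncol = 4)
  cbind(
    xmin = bbox[, 1],
    ymin = bbox[, 2],
    xmax = bbox[, 1] + bbox[, 3],  # x + width
    ymax = bbox[, 2] + bbox[, 4]   # y + height
  )
}

# A box starting at (10, 20) that is 100 wide and 50 tall.
converted <- coco_to_xyxy(c(10, 20, 100, 50))
print(converted)
```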

Examples


if (FALSE) {
library(torchvision)

# Download the dataset if needed and load the training split
ds <- rf100_document_collection(
  dataset = "tweeter_post",
  split = "train",
  transform = transform_to_tensor,
  download = TRUE
)

# Retrieve a sample and inspect annotations
item <- ds[1]
item$y$labels
item$y$boxes

# Draw bounding boxes and display the image
boxed_img <- draw_bounding_boxes(item)
tensor_image_browse(boxed_img)
}