torchvision (version 0.7.0)

coco_detection_dataset: COCO Detection Dataset

Description

Loads the MS COCO dataset for object detection and segmentation.

Usage

coco_detection_dataset(
  root = tempdir(),
  train = TRUE,
  year = c("2017", "2014"),
  download = FALSE,
  transform = NULL,
  target_transform = NULL
)

Value

An object of class coco_detection_dataset. Each item is a list:

  • x: a (C, H, W) torch_tensor representing the image.

  • y$boxes: a (N, 4) torch_tensor of bounding boxes in the format c(x_min, y_min, x_max, y_max).

  • y$labels: an integer torch_tensor with the class label for each object.

  • y$area: a float torch_tensor indicating the area of each object.

  • y$iscrowd: a boolean torch_tensor, where TRUE marks the object as part of a crowd.

  • y$segmentation: a list of segmentation polygons for each object.

  • y$masks: a (N, H, W) boolean torch_tensor containing binary segmentation masks.

The returned object has S3 classes "image_with_bounding_box" and "image_with_segmentation_mask" to enable automatic dispatch by visualization functions such as draw_bounding_boxes() and draw_segmentation_masks().
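A minimal sketch of accessing these fields, assuming the dataset ds has already been constructed as in the Examples below; the field names follow the structure documented above:

item <- ds[1]
item$x$shape       # image tensor, e.g. (3, H, W)
item$y$boxes       # (N, 4) tensor of boxes in x_min, y_min, x_max, y_max order
item$y$labels      # integer class label per object
item$y$masks$shape # (N, H, W) boolean segmentation masks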

Arguments

root

Root directory where the dataset is stored or will be downloaded to.

train

Logical. If TRUE, loads the training split; otherwise, loads the validation split.

year

Character. Dataset version year. One of "2014" or "2017".

download

Logical. If TRUE, downloads the dataset if it's not already present in the root directory.

transform

Optional transform function applied to the image.

target_transform

Optional transform function applied to the target (labels, boxes, etc.).
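A minimal sketch of supplying a transform. The anonymous function below is illustrative only: it casts the image tensor to float32 and makes no change to the target, so the bounding boxes remain valid.

ds <- coco_detection_dataset(
  train = FALSE,
  year = "2017",
  download = TRUE,
  # illustrative transform: cast the image tensor to float32
  transform = function(x) x$to(dtype = torch::torch_float())
)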

Details

The returned image is in CHW format (channels, height, width), matching the torch convention. Each item's y component provides object detection annotations, including bounding boxes, labels, areas, crowd indicators, and segmentation masks, taken from the official COCO annotations.
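A minimal sketch of converting from CHW to HWC layout, assuming item was retrieved as in the Examples below; many base R plotting utilities expect arrays in HWC order:

img_chw <- item$x                       # (C, H, W) torch_tensor
img_hwc <- img_chw$permute(c(2, 3, 1))  # reorder dimensions to (H, W, C)
dim(as.array(img_hwc))                  # e.g. H x W x 3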

Examples

if (FALSE) {
# Load the COCO 2017 validation split (downloads it if not already present)
ds <- coco_detection_dataset(
  train = FALSE,
  year = "2017",
  download = TRUE
)

item <- ds[1]

# Visualize bounding boxes
boxed <- draw_bounding_boxes(item)
tensor_image_browse(boxed)

# Visualize segmentation masks (if present)
masked <- draw_segmentation_masks(item)
tensor_image_browse(masked)
}
