OpenML (version 1.10)

listOMLDataSets: List the first 5000 OpenML data sets.

Description

The returned data.frame contains the data set id “data.id”, the “status” (“active”, “deactivated”, or “in_preparation”), and data qualities describing each data set. By default only the first 5000 data sets are returned (because of the default argument “limit = 5000”).

Usage

listOMLDataSets(number.of.instances = NULL, number.of.features = NULL,
  number.of.classes = NULL, number.of.missing.values = NULL,
  tag = NULL, data.name = NULL, limit = 5000, offset = NULL,
  status = "active", verbosity = NULL)

Arguments

number.of.instances

[numeric(1) | numeric(2)] If not NULL, subsets the entries to those whose number of instances matches the given value or, if a vector of length 2 is passed, falls within the given range.

number.of.features

[numeric(1) | numeric(2)] If not NULL, subsets the entries to those whose number of features matches the given value or, if a vector of length 2 is passed, falls within the given range.

number.of.classes

[numeric(1) | numeric(2)] If not NULL, subsets the entries to those whose number of classes matches the given value or, if a vector of length 2 is passed, falls within the given range.

number.of.missing.values

[numeric(1) | numeric(2)] If not NULL, subsets the entries to those whose number of missing values matches the given value or, if a vector of length 2 is passed, falls within the given range.

tag

[character] If not NULL, only entries with the corresponding tags are listed.

data.name

[character(1)] Name of the data set.

limit

[numeric(1)] Optional. The maximum number of entries to return. If offset is not specified, the first 'limit' entries are returned. Setting limit = NULL returns all available entries.

offset

[numeric(1)] Optional. The offset to start from. This is a position in the listing, counted from 0, and does not refer to a data set ID. It is ignored when no limit is given.

status

[character] Subsets the results according to the status. Possible values are {"active", "deactivated", "in_preparation", "all"}. Default is "active".

verbosity

[integer(1)] Print verbose output on console? Possible values are: 0: normal output, 1: info output, 2: debug output. Default is set via setOMLConfig.
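
The filter arguments above can be combined in a single call. The following is a minimal sketch, not part of the package: find_small_binary is a hypothetical helper name, and the concrete filter values are arbitrary. The listing function is passed as a parameter so the sketch can be tried without a server connection; by default it uses listOMLDataSets, which assumes the OpenML package is installed and the server is reachable.

```r
# Hypothetical helper: list small binary classification data sets
# without missing values. The default list_fun is only evaluated
# when no other listing function is supplied.
find_small_binary = function(list_fun = OpenML::listOMLDataSets) {
  list_fun(
    number.of.instances = c(100, 1000),  # length 2: treated as a range
    number.of.classes = 2,               # length 1: a single value
    number.of.missing.values = 0,
    limit = 50                           # keep the listing short
  )
}
```

Calling find_small_binary() with no argument would send the query to the OpenML server and return the matching entries as a data.frame.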

Value

[data.frame].
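
As a rough illustration of working with the returned data.frame, the sketch below relies only on the two columns named in the Description, "data.id" and "status". summarise_listing is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: summarise a listing returned by listOMLDataSets().
summarise_listing = function(datasets) {
  stopifnot(is.data.frame(datasets))
  stopifnot(all(c("data.id", "status") %in% names(datasets)))
  list(
    n = nrow(datasets),                 # number of listed data sets
    first.ids = head(datasets$data.id), # first few data set ids
    by.status = table(datasets$status)  # counts per status value
  )
}
```

With status = "all" in the listing call, by.status shows how many entries fall under each status.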

See Also

Other listing functions: chunkOMLlist, listOMLDataSetQualities, listOMLEstimationProcedures, listOMLEvaluationMeasures, listOMLFlows, listOMLRuns, listOMLSetup, listOMLStudies, listOMLTaskTypes, listOMLTasks

Other data set-related functions: OMLDataSetDescription, OMLDataSet, convertMlrTaskToOMLDataSet, convertOMLDataSetToMlr, deleteOMLObject, getOMLDataSet, tagOMLObject, uploadOMLDataSet

Examples

## Not run: 
library(OpenML)
datasets = listOMLDataSets()
tail(datasets)
## End(Not run)
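
The limit and offset arguments can be used together to walk through the listing page by page. The following is a sketch under stated assumptions: fetch_in_pages, page_size and max_pages are hypothetical names, and it assumes the listing function returns a data.frame with zero rows once the offset passes the last entry, which has not been checked against the server. The chunkOMLlist function listed under See Also may be the more direct route.

```r
# Hypothetical helper: collect a listing in pages of page_size entries.
# list_fun is any function that accepts limit and offset and returns
# a data.frame, e.g. listOMLDataSets.
fetch_in_pages = function(list_fun, page_size = 1000, max_pages = 100, ...) {
  pages = list()
  for (i in seq_len(max_pages)) {
    offset = (i - 1) * page_size       # position counted from 0, not a data set id
    page = list_fun(limit = page_size, offset = offset, ...)
    if (nrow(page) == 0) break         # nothing left to fetch
    pages[[i]] = page
    if (nrow(page) < page_size) break  # short page: this was the last one
  }
  do.call(rbind, pages)                # NULL if no page was fetched
}
```

A call such as fetch_in_pages(listOMLDataSets, page_size = 1000, status = "active") would then pass the extra filter through to every page request.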
