SetTarget: Set the target variable (and by default, start the DataRobot Autopilot)

Description

This function sets the target variable for the project identified by project, which starts the process of building models to predict the response variable target. Only project and target are required; together they are sufficient to start a modeling project, with DataRobot defaults applied to the remaining optional parameters.

Usage

SetTarget(project, target, metric = NULL, weights = NULL,
  partition = NULL, mode = NULL, seed = NULL, targetType = NULL,
  positiveClass = NULL, blueprintThreshold = NULL, responseCap = NULL,
  quickrun = NULL, featurelistId = NULL, smartDownsampled = NULL,
  majorityDownsamplingRate = NULL, scaleoutModelingMode = NULL,
  accuracyOptimizedBlueprints = NULL, offset = NULL, exposure = NULL,
  eventsCount = NULL, maxWait = 600)

Arguments

project

character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier.
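
The two accepted forms of project can be reduced to a single identifier string. The following is a minimal base-R sketch of that idea; resolveProjectId is a hypothetical helper written for illustration and is not a function from the datarobot package.

```r
# Hypothetical helper (not part of the datarobot package): accept either
# a project ID string or a list containing the element projectId.
resolveProjectId <- function(project) {
  if (is.list(project)) {
    # [[ ]] is used instead of $ to avoid partial name matching.
    id <- project[["projectId"]]
  } else {
    id <- project
  }
  if (!is.character(id) || length(id) != 1L) {
    stop("project must be an ID string or a list with element projectId")
  }
  id
}

resolveProjectId("59a5af20c80891534e3c2bde")
resolveProjectId(list(projectId = "59a5af20c80891534e3c2bde"))
```

Both calls return the same identifier string, so either form can be passed wherever a project is expected.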

target

character. String giving the name of the response variable to be predicted by all project models.

metric

character. Optional. String specifying the model fitting metric to be optimized; a list of valid options for this parameter, which depends on both project and target, may be obtained with the function GetValidMetrics.

weights

character. Optional. String specifying the name of the column from the modeling dataset to be used as weights in model fitting.

partition

partition. Optional. S3 object of class 'partition' whose elements specify a valid partitioning scheme. See help for functions CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition, CreateUserPartition and CreateDatetimePartitionSpecification.

mode

character. Optional. Specifies the autopilot mode used to start the modeling project; valid options are 'auto' (fully automatic, the current DataRobot default, obtained when mode = NULL), 'manual' and 'quick'. These values can also be supplied through AutopilotMode, for example AutopilotMode$Manual.

seed

integer. Optional. Seed for the random number generator used in creating random partitions for model fitting.

targetType

character. Optional. Specifies the target type to use for a project when it is ambiguous, e.g. a numeric target with a few unique values that could be used for either regression or multiclass. Valid options are 'Binary', 'Multiclass' and 'Regression'. See TargetType for an easier way to keep track of the options.

positiveClass

character. Optional. Target variable value corresponding to a positive response in binary classification problems.

blueprintThreshold

integer. Optional. The maximum time (in hours) that any modeling blueprint is allowed to run before being excluded from subsequent autopilot stages.

responseCap

numeric. Optional. Floating point value, between 0.5 and 1.0, specifying a capping limit for the response variable. The default value NULL corresponds to an uncapped response, equivalent to responseCap = 1.0.
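
The relationship between NULL and 1.0 described above can be sketched in base R. effectiveResponseCap is a hypothetical helper, not part of the datarobot package, and treating both bounds as inclusive is an assumption (the text only says "between 0.5 and 1.0").

```r
# Hypothetical helper (not part of the datarobot package): map a
# responseCap argument to the cap that is effectively applied.
effectiveResponseCap <- function(responseCap = NULL) {
  if (is.null(responseCap)) {
    return(1.0)  # NULL means uncapped, documented as equivalent to 1.0
  }
  # Assumption: the bounds 0.5 and 1.0 are both inclusive.
  if (!is.numeric(responseCap) || length(responseCap) != 1L ||
      is.na(responseCap) || responseCap < 0.5 || responseCap > 1.0) {
    stop("responseCap must be a single number between 0.5 and 1.0")
  }
  responseCap
}

effectiveResponseCap()      # default: uncapped
effectiveResponseCap(0.95)  # explicit cap
```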

quickrun

logical. Optional. If TRUE, DataRobot will perform a quickrun, limiting the number of models evaluated during autopilot. (The quickrun flag is deprecated in 2.4 and will be removed in 2.10.)

featurelistId

numeric. Optional. Specifies which feature list to use. If NULL (default), a default feature list is used.

smartDownsampled

logical. Optional. Whether to use smart downsampling to throw away excess rows of the majority class. Only applicable to classification and zero-boosted regression projects.

majorityDownsamplingRate

numeric. Optional. Floating point value, between 0.0 and 100.0, giving the percentage of the majority rows that should be kept. Specify only if using smart downsampling. The chosen rate may not cause the majority class to become smaller than the minority class.
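
The constraints on the two downsampling arguments can be checked on the client before calling SetTarget. The sketch below encodes only the rules stated on this page; checkDownsampling is a hypothetical helper, not part of the datarobot package, and it cannot verify the majority-versus-minority size condition because that depends on the data.

```r
# Hypothetical pre-flight check (not part of the datarobot package).
checkDownsampling <- function(smartDownsampled = NULL,
                              majorityDownsamplingRate = NULL) {
  # Nothing to validate when no rate is given.
  if (is.null(majorityDownsamplingRate)) {
    return(invisible(TRUE))
  }
  # A rate is only meaningful when smart downsampling is switched on.
  if (!isTRUE(smartDownsampled)) {
    stop("majorityDownsamplingRate requires smartDownsampled = TRUE")
  }
  # The rate is a percentage between 0 and 100.
  if (!is.numeric(majorityDownsamplingRate) ||
      length(majorityDownsamplingRate) != 1L ||
      is.na(majorityDownsamplingRate) ||
      majorityDownsamplingRate < 0 || majorityDownsamplingRate > 100) {
    stop("majorityDownsamplingRate must be a number between 0 and 100")
  }
  invisible(TRUE)
}

checkDownsampling(smartDownsampled = TRUE, majorityDownsamplingRate = 75)
```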

scaleoutModelingMode

character. Optional. Specifies the behavior of Scaleout models for the project. Possible options are in ScaleoutModelingMode.

ScaleoutModelingMode$Disabled will prevent scaleout models from running during autopilot and will prevent them from showing up in blueprints.
ScaleoutModelingMode$RepositoryOnly will prevent scaleout models from running during autopilot, but will make them available in blueprints to run manually.
ScaleoutModelingMode$Autopilot will run scaleout models during autopilot and will make them available in blueprints.

Note that scaleout models are only supported in the Hadoop environment with the correct corresponding user permission set.

accuracyOptimizedBlueprints

logical. Optional. When enabled, accuracy optimized blueprints will run in autopilot for the project. These are longer-running model blueprints that provide increased accuracy over normal blueprints that run during autopilot.

offset

character. Optional. Vector of the names of the columns containing the offset of each row.

exposure

character. Optional. The name of a column containing the exposure of each row.

eventsCount

character. Optional. The name of a column specifying the events count.

maxWait

integer. Specifies how many seconds to wait for the server to finish analyzing the target and begin the modeling process. If the process takes longer than this parameter specifies, execution will stop (but the server will continue to process the request).
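
The timeout behavior described for maxWait follows a common pattern: poll until done or until a deadline passes, then stop waiting locally. The base-R sketch below illustrates that pattern only; waitWithTimeout, makeFakeCheck and the interval argument are hypothetical names and do not reflect how the datarobot package implements its wait.

```r
# Hypothetical polling loop (not the datarobot implementation): call
# isDone() until it returns TRUE or maxWait seconds have elapsed.
waitWithTimeout <- function(isDone, maxWait = 600, interval = 1) {
  deadline <- Sys.time() + maxWait
  repeat {
    if (isTRUE(isDone())) {
      return(TRUE)
    }
    if (Sys.time() >= deadline) {
      # Giving up here only stops the local wait; a remote job that
      # isDone() was checking on would be unaffected.
      stop("timed out after ", maxWait, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Stand-in for a server status check: reports completion on the n-th call.
makeFakeCheck <- function(n) {
  calls <- 0
  function() {
    calls <<- calls + 1
    calls >= n
  }
}

waitWithTimeout(makeFakeCheck(3), maxWait = 5, interval = 0.1)
```

A caller that hits the timeout can catch the error with tryCatch and check on the project again later, since the request itself keeps being processed.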

Examples

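# NOT RUN {
  library(datarobot)
  projectId <- "59a5af20c80891534e3c2bde"
  # The calls below are alternatives; use only one of them for a given project.
  # Default settings:
  SetTarget(projectId, "targetFeature")
  # Optimize a specific metric:
  SetTarget(projectId, "targetFeature", metric = "LogLoss")
  # Start in manual autopilot mode:
  SetTarget(projectId, "targetFeature", mode = AutopilotMode$Manual)
  # Resolve an ambiguous target as multiclass:
  SetTarget(projectId, "targetFeature", targetType = TargetType$Multiclass)
# }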
