Creates a DataSource object. A DataSource references data that can be used to perform create_ml_model, create_evaluation, or create_batch_prediction operations.
See https://www.paws-r-sdk.com/docs/machinelearning_create_data_source_from_s3/ for full documentation.
machinelearning_create_data_source_from_s3(
  DataSourceId,
  DataSourceName = NULL,
  DataSpec,
  ComputeStatistics = NULL
)
DataSourceId - [required] A user-supplied identifier that uniquely identifies the DataSource.
DataSourceName - A user-supplied name or description of the DataSource.
DataSpec - [required] The data specification of a DataSource:
DataLocationS3 - The Amazon S3 location of the observation data.
DataSchemaLocationS3 - The Amazon S3 location of the DataSchema.
DataSchema - A JSON string representing the schema. This is not
required if DataSchemaUri is specified.
DataRearrangement - A JSON string that represents the splitting and
rearrangement requirements for the DataSource.
Sample - "{\"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
ComputeStatistics - Whether to compute statistics for the DataSource. The
statistics are generated from the observation data referenced by the
DataSource. Amazon ML uses the statistics internally during MLModel
training. This parameter must be set to TRUE if the DataSource will be
used for MLModel training.
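The arguments described above could be assembled roughly as follows. This is a minimal sketch, not the package's own example: make_rearrangement() and create_training_source() are hypothetical helpers introduced here for illustration, and the S3 locations, identifier, and name are placeholders. It assumes the usual paws pattern of creating a service client with paws::machinelearning() and calling the operation on it. The real call is left commented out because it needs the paws package and valid AWS credentials.

```r
# Build the DataRearrangement JSON string for a percentage-based split.
# make_rearrangement() is a hypothetical helper, not part of paws.
make_rearrangement <- function(percent_begin, percent_end) {
  stopifnot(percent_begin >= 0, percent_end <= 100, percent_begin < percent_end)
  sprintf(
    '{"splitting":{"percentBegin":%d,"percentEnd":%d}}',
    as.integer(percent_begin), as.integer(percent_end)
  )
}

# Assemble the DataSpec argument as a named list.
# The bucket and object keys are placeholders.
data_spec <- list(
  DataLocationS3 = "s3://example-bucket/observations.csv",
  DataSchemaLocationS3 = "s3://example-bucket/observations.csv.schema",
  DataRearrangement = make_rearrangement(10, 60)
)

# Wrap the request so the arguments can be inspected without contacting AWS.
# `svc` is expected to be a client such as paws::machinelearning().
create_training_source <- function(svc, id, spec) {
  svc$create_data_source_from_s3(
    DataSourceId = id,
    DataSourceName = "training split (10-60%)",
    DataSpec = spec,
    ComputeStatistics = TRUE  # required if the DataSource will train an MLModel
  )
}

# Usage (assumes the paws package and configured AWS credentials):
# svc <- paws::machinelearning()
# create_training_source(svc, "example-datasource-id", data_spec)
```

Building the JSON with sprintf keeps the sketch free of extra dependencies. A JSON library would be a reasonable alternative for more complex rearrangement specifications.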