Creates (registers) a data catalog with the specified name and properties. Catalogs created are visible to all users of the same Amazon Web Services account.
See https://www.paws-r-sdk.com/docs/athena_create_data_catalog/ for full documentation.
athena_create_data_catalog(
Name,
Type,
Description = NULL,
Parameters = NULL,
Tags = NULL
)
[required] The name of the data catalog to create. The catalog name must be unique for the Amazon Web Services account and can use a maximum of 127 alphanumeric, underscore, at sign, or hyphen characters. The remainder of the length constraint of 256 is reserved for use by Athena.
For the FEDERATED type, the catalog name has the following considerations and limits:
The catalog name allows special characters such as _, @, \, and -. These characters are replaced with a hyphen (-) when creating the CFN Stack Name and with an underscore (_) when creating the Lambda Function and Glue Connection Name.
The catalog name has a theoretical limit of 128 characters. However, because the name is used to create other resources that allow fewer characters, and a prefix is prepended to it, the actual catalog name limit for a FEDERATED catalog is 64 - 23 = 41 characters.
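The naming rules above can be sketched as a small pre-flight check. This is an illustrative helper, not part of paws or Athena: the function name `federated_name_check` and its return fields are hypothetical, and because the source does not state the prefix that is prepended, the sketch shows only the character substitution and the 41-character limit, not the final resource names.

```r
# Hypothetical helper: checks a FEDERATED catalog name against the documented
# 41-character limit and previews the documented character substitutions.
federated_name_check <- function(name, max_chars = 64L - 23L) {
  stopifnot(is.character(name), length(name) == 1L)
  # Matches the special characters listed above: _ @ \ -
  special <- "[_@\\\\-]"
  list(
    n_chars      = nchar(name),
    within_limit = nchar(name) <= max_chars,
    # Special characters become hyphens for the CFN Stack Name.
    stack_style  = gsub(special, "-", name, perl = TRUE),
    # Special characters become underscores for the Lambda Function
    # and Glue Connection Name.
    lambda_style = gsub(special, "_", name, perl = TRUE)
  )
}

res <- federated_name_check("sales_db@prod-1")
too_long <- federated_name_check(strrep("a", 42))
```

Running the check before calling the API makes it possible to fail early on a name that exceeds the effective limit.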
[required] The type of data catalog to create: LAMBDA for a federated catalog, GLUE for a Glue Data Catalog, and HIVE for an external Apache Hive metastore. FEDERATED is a federated catalog for which Athena creates the connection and the Lambda function for you based on the parameters that you pass.
A description of the data catalog to be created.
Specifies the Lambda function or functions to use for creating the data catalog. This is a mapping whose values depend on the catalog type.
For the HIVE data catalog type, use the following syntax. The metadata-function parameter is required. The sdk-version parameter is optional and defaults to the currently supported version.
metadata-function=lambda_arn, sdk-version=version_number
For the LAMBDA data catalog type, use one of the following sets of required parameters, but not both.
If you have one Lambda function that processes metadata and another for reading the actual data, use the following syntax. Both parameters are required.
metadata-function=lambda_arn, record-function=lambda_arn
If you have a composite Lambda function that processes both metadata and data, use the following syntax to specify your Lambda function.
function=lambda_arn
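As a minimal sketch, the HIVE and LAMBDA forms above map to named lists in R, one element per key. The ARNs are placeholders, and the validator `is_valid_lambda_params` is a hypothetical helper written for this example; the commented-out call assumes a client created with `paws::athena()` and valid AWS credentials.

```r
# Placeholder ARNs for illustration only.
metadata_arn  <- "arn:aws:lambda:us-east-1:111122223333:function:metadata-fn"
record_arn    <- "arn:aws:lambda:us-east-1:111122223333:function:record-fn"
composite_arn <- "arn:aws:lambda:us-east-1:111122223333:function:composite-fn"

# HIVE: metadata-function is required; sdk-version is optional and omitted here.
hive_params <- list("metadata-function" = metadata_arn)

# LAMBDA, option 1: separate metadata and record functions (both required).
lambda_split_params <- list(
  "metadata-function" = metadata_arn,
  "record-function"   = record_arn
)

# LAMBDA, option 2: a single composite function.
lambda_composite_params <- list("function" = composite_arn)

# Hypothetical check that exactly one of the two LAMBDA forms is used.
is_valid_lambda_params <- function(p) {
  setequal(names(p), c("metadata-function", "record-function")) ||
    setequal(names(p), "function")
}

# Not run: requires AWS credentials.
# svc <- paws::athena()
# svc$create_data_catalog(
#   Name = "my_lambda_catalog",
#   Type = "LAMBDA",
#   Parameters = lambda_composite_params
# )
```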
The GLUE type requires a catalog ID parameter. The catalog_id is the account ID of the Amazon Web Services account to which the Glue Data Catalog belongs.
catalog-id=catalog_id
The GLUE data catalog type also applies to the default AwsDataCatalog that already exists in your account. You can have only one default catalog, and you cannot modify it.
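For the GLUE type, a minimal sketch of the parameter list follows. The account ID is a placeholder, and `is_account_id` is a hypothetical helper; it relies on the fact that AWS account IDs are 12 digits. The ID is kept as a string so that leading zeros are not lost.

```r
# Placeholder account ID; kept as a string so leading zeros survive.
account_id <- "111122223333"

# Hypothetical check: AWS account IDs are 12-digit strings.
is_account_id <- function(x) {
  is.character(x) && length(x) == 1L && grepl("^[0-9]{12}$", x)
}

glue_params <- list("catalog-id" = account_id)

# Not run: requires AWS credentials.
# svc <- paws::athena()
# svc$create_data_catalog(
#   Name = "my_glue_catalog",
#   Type = "GLUE",
#   Parameters = glue_params
# )
```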
The FEDERATED data catalog type uses one of the following parameters, but not both. Use connection-arn for an existing Glue connection. Use connection-type and connection-properties to specify the configuration settings for a new connection.
connection-arn:<glue_connection_arn_to_reuse>
lambda-role-arn (optional): The execution role to use for the Lambda function. If not provided, one is created.
connection-type:MYSQL|REDSHIFT|...., connection-properties:"<json_string>"
For <json_string>, use escaped JSON text, as in the following example.
"{\"spill_bucket\":\"my_spill\",\"spill_prefix\":\"athena-spill\",\"host\":\"abc12345.snowflakecomputing.com\",\"port\":\"1234\",\"warehouse\":\"DEV_WH\",\"database\":\"TEST\",\"schema\":\"PUBLIC\",\"SecretArn\":\"arn:aws:secretsmanager:ap-south-1:111122223333:secret:snowflake-XHb67j\"}"
A list of comma-separated tags to add to the data catalog that is created. All the resources that are created by the create_data_catalog API operation with the FEDERATED type will have the tag federated_athena_datacatalog="true". This includes the CFN Stack, Glue Connection, Athena DataCatalog, and all the resources created as part of the CFN Stack (Lambda Function, IAM policies/roles).
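A FEDERATED request with a new connection and user-supplied tags might be assembled as in the sketch below. Several things here are assumptions: the jsonlite package is installed, the connection property keys and values are illustrative (the keys a given connector requires are not stated above), the SDK escapes the JSON string when it serialises the request, and `to_tag_list` is a hypothetical helper. The tag shape, a list of `Key`/`Value` pairs, follows the paws convention for Athena tags.

```r
library(jsonlite)

# Illustrative connection properties; keys modelled on the example above.
connection_properties <- list(
  spill_bucket = "my_spill",
  spill_prefix = "athena-spill",
  host         = "db.example.com",
  port         = "3306",
  database     = "TEST"
)

# Serialise to a single JSON string; auto_unbox keeps scalars as scalars.
json_string <- as.character(toJSON(connection_properties, auto_unbox = TRUE))

# New connection: connection-type plus connection-properties
# (do not combine with connection-arn).
federated_params <- list(
  "connection-type"       = "MYSQL",
  "connection-properties" = json_string
)

# Hypothetical helper: named character vector -> list of Key/Value pairs.
to_tag_list <- function(x) {
  unname(Map(function(k, v) list(Key = k, Value = v), names(x), x))
}

tags <- to_tag_list(c(team = "analytics", env = "dev"))

# Not run: requires AWS credentials.
# svc <- paws::athena()
# svc$create_data_catalog(
#   Name = "my_federated_catalog",
#   Type = "FEDERATED",
#   Parameters = federated_params,
#   Tags = tags
# )
```

Building the JSON with a serialiser rather than by hand avoids quoting mistakes in the connection-properties string.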