S3

This document describes the parameters required for accessing AWS S3. These parameters apply to:

  • Catalog properties.
  • Table Valued Function properties.
  • Broker Load properties.
  • Export properties.
  • Outfile properties.

Parameter Overview

Property Name | Legacy Name | Description | Default | Required
--- | --- | --- | --- | ---
s3.endpoint | | S3 service access endpoint, e.g., s3.us-east-1.amazonaws.com | None | No
s3.access_key | | AWS Access Key for authentication | None | No
s3.secret_key | | AWS Secret Key for authentication | None | No
s3.region | | S3 region, e.g., us-east-1. Strongly recommended | None | Yes
s3.use_path_style | | Whether to use path-style access | FALSE | No
s3.connection.maximum | | Maximum number of connections for high concurrency scenarios | 50 | No
s3.connection.request.timeout | | Request timeout (milliseconds), controls connection acquisition timeout | 3000 | No
s3.connection.timeout | | Connection establishment timeout (milliseconds) | 1000 | No
s3.role_arn | | Role ARN specified when using Assume Role mode | None | No
s3.external_id | | External ID used with s3.role_arn | None | No
s3.credentials_provider_type | | Credentials provider type for AWS authentication (used without AK/SK; used as STS source credentials in IAM Role mode) | DEFAULT | No

Version note: s3.credentials_provider_type is supported since 3.1.4 and 4.0.3.

Authentication Configuration

Doris supports the following three methods to access S3:

1. Direct Access Key and Secret Key (AK/SK)

"s3.access_key"="your-access-key",
"s3.secret_key"="your-secret-key",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

2. IAM Role (Assume Role) mode

Suitable for cross-account access and temporary authorization. Doris assumes the configured role and obtains temporary credentials automatically.

"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.external_id"="external-identifier",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

Configure s3.credentials_provider_type in IAM Role mode

When s3.role_arn is configured, s3.credentials_provider_type controls which source credentials provider is used for STS AssumeRole:

  1. Obtain source credentials from the provider selected by s3.credentials_provider_type.
  2. Call STS AssumeRole with the source credentials.
  3. Access S3 with the returned temporary credentials.
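
The AWS side of this flow can be sketched in Python. This is only an illustration of the STS exchange, not how Doris implements it internally. The function name, the `session_name` default, and the returned key names are hypothetical; `assume_role` and its `RoleArn`, `RoleSessionName`, and `ExternalId` parameters follow the AWS STS API as exposed by boto3.

```python
from typing import Any, Dict, Optional


def assume_role_credentials(
    sts_client: Any,
    role_arn: str,
    external_id: Optional[str] = None,
    session_name: str = "doris-s3-demo",  # hypothetical session name
) -> Dict[str, str]:
    """Exchange source credentials for temporary role credentials.

    Step 1 is assumed to have happened already: `sts_client` carries the
    source credentials (for example, from an EC2 Instance Profile).
    """
    kwargs: Dict[str, str] = {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
    }
    if external_id:
        # ExternalId is only sent when one is configured.
        kwargs["ExternalId"] = external_id

    # Step 2: call STS AssumeRole with the source credentials.
    response = sts_client.assume_role(**kwargs)
    creds = response["Credentials"]

    # Step 3 would sign S3 requests with these temporary values.
    return {
        "access_key": creds["AccessKeyId"],
        "secret_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
    }
```

With boto3 installed and source credentials available in the environment, a client created with `boto3.client("sts", region_name="us-east-1")` could be passed as `sts_client`.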

IAM Role + s3.credentials_provider_type examples

Example 1: EC2 Instance Profile as STS source credentials

"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.external_id"="external-identifier",
"s3.credentials_provider_type"="INSTANCE_PROFILE",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

Example 2: Web Identity (for example IRSA) as STS source credentials

"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.credentials_provider_type"="WEB_IDENTITY",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

Example 3: Container metadata as STS source credentials

"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.credentials_provider_type"="CONTAINER",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

Example 4: Default provider chain as STS source credentials

"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.credentials_provider_type"="DEFAULT",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

3. Specify credential source with s3.credentials_provider_type

This is suitable for scenarios without explicit AK/SK, such as EC2 Instance Profile, container metadata, or Web Identity.

"s3.credentials_provider_type"="INSTANCE_PROFILE",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"

Supported values for s3.credentials_provider_type

Value | Description
--- | ---
DEFAULT | Use default provider chain
ENV | Read credentials from environment variables
SYSTEM_PROPERTIES | Read credentials from system properties
WEB_IDENTITY | Use Web Identity Token credentials
CONTAINER | Use container metadata credentials
INSTANCE_PROFILE | Use EC2 Instance Profile credentials
ANONYMOUS | Anonymous access (for public buckets)

Effective rules when configured together

  1. If s3.access_key and s3.secret_key are both configured, AK/SK is used first.
  2. If AK/SK is not configured and s3.role_arn is configured, IAM Role is used. In this case, s3.credentials_provider_type is used to select STS source credentials.
  3. If neither AK/SK nor s3.role_arn is configured, s3.credentials_provider_type directly determines the credentials provider used by the S3 client.

Note: s3.access_key and s3.secret_key must be configured together.
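
The precedence rules above can be expressed as a small decision function. This is a minimal sketch for reasoning about a property set before submitting it, not Doris's actual implementation. The function name and the returned labels are hypothetical, and exact (case-sensitive) matching of the provider type is an assumption.

```python
from typing import Dict

# Values listed under "Supported values for s3.credentials_provider_type".
PROVIDER_TYPES = {
    "DEFAULT", "ENV", "SYSTEM_PROPERTIES", "WEB_IDENTITY",
    "CONTAINER", "INSTANCE_PROFILE", "ANONYMOUS",
}


def resolve_auth_mode(props: Dict[str, str]) -> str:
    """Return a label describing which authentication path would apply."""
    ak = props.get("s3.access_key")
    sk = props.get("s3.secret_key")
    if bool(ak) != bool(sk):
        raise ValueError(
            "s3.access_key and s3.secret_key must be configured together"
        )

    # DEFAULT is the documented default when the property is absent.
    provider = props.get("s3.credentials_provider_type", "DEFAULT")
    if provider not in PROVIDER_TYPES:
        raise ValueError(f"unsupported credentials provider type: {provider}")

    if ak and sk:
        return "AK/SK"  # rule 1: explicit keys win
    if props.get("s3.role_arn"):
        # rule 2: the provider supplies the STS source credentials
        return f"IAM_ROLE(source={provider})"
    # rule 3: the provider is used directly by the S3 client
    return f"PROVIDER({provider})"
```

For example, a property set containing only `s3.role_arn` and `s3.credentials_provider_type` set to `WEB_IDENTITY` resolves to the IAM Role path with Web Identity as the source.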

For instructions on configuring AWS authentication and authorization, refer to the aws-authentication-and-authorization document.

Accessing S3 Directory Bucket

This feature is supported since version 3.1.0.

Amazon S3 Express One Zone stores data in directory buckets, which provide higher performance than regular buckets but use a different endpoint format.

  • Regular bucket: s3.us-east-1.amazonaws.com
  • Directory Bucket: s3express-usw2-az1.us-west-2.amazonaws.com

For more available regions, refer to: AWS Official Documentation

Example:

"s3.access_key"="ak",
"s3.secret_key"="sk",
"s3.endpoint"="s3express-usw2-az1.us-west-2.amazonaws.com",
"s3.region"="us-west-2"

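A quick way to tell the two endpoint formats apart is to check the host prefix. The sketch below is a heuristic based only on the two example endpoints shown above; the function name is hypothetical, and the `s3express-` prefix check is an assumption rather than a rule taken from Doris.

```python
def is_directory_bucket_endpoint(endpoint: str) -> bool:
    """Heuristically detect an S3 Express One Zone (directory bucket) endpoint."""
    # Drop an optional scheme such as "https://".
    host = endpoint.split("://", 1)[-1]
    # Assumption: directory bucket endpoints start with "s3express-".
    return host.lower().startswith("s3express-")
```

Such a check could be used to warn when a directory bucket is paired with a regular endpoint, or the other way around.
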
Permission Policies

Depending on the use case, permissions can be categorized into read-only and read-write policies.

1. Read-only Permissions

Only allows reading objects from S3. Suitable for LOAD, TVF, querying EXTERNAL CATALOG, and other scenarios.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>/<your-prefix>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>"
    }
  ]
}

2. Read-write Permissions

In addition to the read-only permissions, allows creating, modifying, and deleting objects. Suitable for EXPORT, OUTFILE, and EXTERNAL CATALOG write-back scenarios.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>/<your-prefix>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning",
        "s3:GetLifecycleConfiguration"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>"
    }
  ]
}

Notes

  1. Placeholder Replacement

    • <your-bucket> → Your S3 Bucket name.
    • <your-prefix> → The object key prefix that Doris should be allowed to access.
    • <account-id> → Your AWS account ID (12-digit number).
  2. Principle of Least Privilege

    • If only querying, do not grant write permissions.
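
Placeholder replacement can also be scripted so that the policy is always emitted as valid JSON. The following is a minimal sketch that builds the read-only policy shown earlier for a given bucket and prefix. The function name and the sample bucket and prefix values are hypothetical.

```python
import json
from typing import Any, Dict


def build_read_only_policy(bucket: str, prefix: str = "") -> Dict[str, Any]:
    """Build the read-only policy for one bucket and an optional key prefix."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    clean_prefix = prefix.strip("/")
    # Without a prefix, object-level access covers the whole bucket.
    object_arn = f"{bucket_arn}/{clean_prefix}/*" if clean_prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": object_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    print(json.dumps(build_read_only_policy("my-bucket", "warehouse"), indent=2))
```

Serializing with `json.dumps` avoids hand-editing mistakes such as trailing commas, which IAM rejects as invalid JSON.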