S3
This document describes the parameters required for accessing AWS S3. These parameters apply to:
- Catalog properties.
- Table Valued Function properties.
- Broker Load properties.
- Export properties.
- Outfile properties.
Configure BE CA Certificate for HTTPS
Starting from Doris 2.1, you can explicitly configure ca_cert_file_paths in be.conf when Doris BE accesses S3 over HTTPS.
By default, if ca_cert_file_paths is not configured, Doris uses the operating system's default CA certificates. In most environments, you do not need to set this parameter manually. Configure it in the following cases:
- The BE node is missing system CA certificates, or the installed CA bundle is too old.
- The BE node runs in a minimal container or image that does not include the ca-certificates package.
- The default CA file path on the BE node is invalid, or the Doris process does not have read permission on the CA file.
- Your environment uses a self-signed certificate, a private CA, or a corporate proxy or gateway that re-signs TLS traffic.
- You encounter errors such as "Problem with the SSL CA cert" or "curl 77: Problem with the SSL CA cert (path? access rights?)" when accessing S3.
Example:
# be.conf
ca_cert_file_paths = /etc/ssl/certs/ca-certificates.crt
Common CA bundle paths:
- Debian / Ubuntu: /etc/ssl/certs/ca-certificates.crt
- CentOS / Rocky Linux: /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt
Configure this item on every BE node that may access S3, and ensure that the certificate file exists and is readable by the Doris process. After updating be.conf, restart the affected BE nodes to apply the change.
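Before setting ca_cert_file_paths, it can help to confirm which bundle actually exists and is readable on a BE node. The following is a minimal Python sketch, not part of Doris: the helper name find_readable_ca_bundles is hypothetical, and the candidate list contains only the two paths listed above. Run it as the OS user that starts the BE process so that the permission check reflects what Doris sees.

```python
import os

# Candidate paths taken from the list above; extend for other distributions.
COMMON_CA_BUNDLES = [
    "/etc/ssl/certs/ca-certificates.crt",  # Debian / Ubuntu
    "/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt",  # CentOS / Rocky Linux
]


def find_readable_ca_bundles(candidates=COMMON_CA_BUNDLES):
    """Return the candidate paths that are regular, non-empty, readable files."""
    usable = []
    for path in candidates:
        if (
            os.path.isfile(path)
            and os.access(path, os.R_OK)
            and os.path.getsize(path) > 0
        ):
            usable.append(path)
    return usable


if __name__ == "__main__":
    # Print each usable bundle in the form expected in be.conf.
    for bundle in find_readable_ca_bundles():
        print(f"ca_cert_file_paths = {bundle}")
```

If the script prints nothing, none of the listed bundles is usable by that user, which matches the cases described above where an explicit path has to be configured.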
Parameter Overview
| Property Name | Legacy Name | Description | Default | Required |
|---|---|---|---|---|
| s3.endpoint | | S3 service access endpoint, e.g., s3.us-east-1.amazonaws.com | None | No |
| s3.access_key | | AWS Access Key for authentication | None | No |
| s3.secret_key | | AWS Secret Key for authentication | None | No |
| s3.region | | S3 region, e.g., us-east-1. Strongly recommended | None | Yes |
| s3.use_path_style | | Whether to use path-style access | FALSE | No |
| s3.connection.maximum | | Maximum number of connections for high concurrency scenarios | 50 | No |
| s3.connection.request.timeout | | Request timeout (milliseconds), controls connection acquisition timeout | 3000 | No |
| s3.connection.timeout | | Connection establishment timeout (milliseconds) | 1000 | No |
| s3.role_arn | | Role ARN specified when using Assume Role mode | None | No |
| s3.external_id | | External ID used with s3.role_arn | None | No |
| s3.credentials_provider_type | | Credentials provider type for AWS authentication (used without AK/SK; used as STS source credentials in IAM Role mode) | DEFAULT | No |
Version note: s3.credentials_provider_type is supported since 3.1.4 and 4.0.3.
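The property fragments used throughout this document are plain "key"="value" pairs. As a rough illustration, the Python sketch below assembles such a fragment from a dictionary, here with two connection settings raised above their defaults from the table. render_properties is a hypothetical helper, not a Doris API, and the override values are arbitrary examples.

```python
def render_properties(props):
    """Render a dict as comma-separated "key"="value" lines, in insertion order."""
    lines = []
    for key, value in props.items():
        text = str(value)
        if '"' in key or '"' in text:
            # Escaping rules are not covered here, so reject instead of guessing.
            raise ValueError(f"double quote not supported in {key!r}")
        lines.append(f'"{key}"="{text}"')
    return ",\n".join(lines)


props = {
    "s3.endpoint": "s3.us-east-1.amazonaws.com",
    "s3.region": "us-east-1",
    "s3.connection.maximum": 100,           # raised from the default of 50
    "s3.connection.request.timeout": 5000,  # milliseconds, default is 3000
}
print(render_properties(props))
```

The printed fragment can be pasted into the property list of whichever statement is being written (catalog, TVF, load, export, or outfile).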
Authentication Configuration
Doris supports the following three methods to access S3:
1. Direct Access Key and Secret Key (AK/SK)
"s3.access_key"="your-access-key",
"s3.secret_key"="your-secret-key",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
2. IAM Role (Assume Role) mode
Suitable for cross-account and temporary authorization access. Doris automatically obtains temporary credentials through role authorization.
"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.external_id"="external-identifier",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
Configure s3.credentials_provider_type in IAM Role mode
When s3.role_arn is configured, s3.credentials_provider_type controls which source credentials provider is used for STS AssumeRole:
- Get source credentials from s3.credentials_provider_type.
- Call STS AssumeRole with source credentials.
- Access S3 with the returned temporary credentials.
IAM Role + s3.credentials_provider_type examples
Example 1: EC2 Instance Profile as STS source credentials
"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.external_id"="external-identifier",
"s3.credentials_provider_type"="INSTANCE_PROFILE",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
Example 2: Web Identity (for example IRSA) as STS source credentials
"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.credentials_provider_type"="WEB_IDENTITY",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
Example 3: Container metadata as STS source credentials
"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.credentials_provider_type"="CONTAINER",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
Example 4: Default provider chain as STS source credentials
"s3.role_arn"="arn:aws:iam::123456789012:role/demo-role",
"s3.credentials_provider_type"="DEFAULT",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
3. Specify credential source with s3.credentials_provider_type
This is suitable for scenarios without explicit AK/SK, such as EC2 Instance Profile, container metadata, or Web Identity.
"s3.credentials_provider_type"="INSTANCE_PROFILE",
"s3.endpoint"="s3.us-east-1.amazonaws.com",
"s3.region"="us-east-1"
Supported values for s3.credentials_provider_type
| Value | Description |
|---|---|
| DEFAULT | Use default provider chain |
| ENV | Read credentials from environment variables |
| SYSTEM_PROPERTIES | Read credentials from system properties |
| WEB_IDENTITY | Use Web Identity Token credentials |
| CONTAINER | Use container metadata credentials |
| INSTANCE_PROFILE | Use EC2 Instance Profile credentials |
| ANONYMOUS | Anonymous access (for public buckets) |
Effective rules when configured together
- If s3.access_key and s3.secret_key are both configured, AK/SK is used first.
- If AK/SK is not configured and s3.role_arn is configured, IAM Role is used. In this case, s3.credentials_provider_type is used to select STS source credentials.
- If neither AK/SK nor s3.role_arn is configured, s3.credentials_provider_type directly determines the credentials provider used by the S3 client.
Note: s3.access_key and s3.secret_key must be configured together.
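The precedence rules above can be sketched as a small decision function. This is an illustrative Python model only, not Doris code: the function name resolve_auth_mode and the mode labels AK_SK, IAM_ROLE, and PROVIDER are hypothetical, and raising an error for a lone access key or secret key is an assumption about how the "configured together" rule would be enforced.

```python
def resolve_auth_mode(props):
    """Return (mode, provider) for a property dict.

    Mirrors the precedence described above:
    AK/SK first, then IAM Role, then the credentials provider type.
    """
    has_ak = bool(props.get("s3.access_key"))
    has_sk = bool(props.get("s3.secret_key"))
    if has_ak != has_sk:
        # Assumption: a half-configured key pair is treated as an error.
        raise ValueError("s3.access_key and s3.secret_key must be configured together")

    # DEFAULT is the documented default for s3.credentials_provider_type.
    provider = props.get("s3.credentials_provider_type", "DEFAULT")
    if has_ak and has_sk:
        return ("AK_SK", None)
    if props.get("s3.role_arn"):
        # The provider only supplies the source credentials for STS AssumeRole.
        return ("IAM_ROLE", provider)
    # The provider is used directly by the S3 client.
    return ("PROVIDER", provider)


print(resolve_auth_mode({
    "s3.role_arn": "arn:aws:iam::123456789012:role/demo-role",
    "s3.credentials_provider_type": "WEB_IDENTITY",
}))
```

Running a property set through a check like this before submitting it makes it easier to see why, for example, a configured s3.role_arn is ignored when AK/SK is also present.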
For instructions on configuring AWS authentication and authorization, refer to the document aws-authentication-and-authorization.
Accessing S3 Directory Bucket
This feature is supported since version 3.1.0.
Amazon S3 Express One Zone (also known as Directory Bucket) provides higher performance, but has a different endpoint format.
- Regular bucket: s3.us-east-1.amazonaws.com
- Directory Bucket: s3express-usw2-az1.us-west-2.amazonaws.com
For more available regions, refer to: AWS Official Documentation
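Because the two endpoint formats are easy to mix up, a quick check on the endpoint string can catch a mismatch early. The Python sketch below is a heuristic based only on the two sample endpoints above (it assumes Directory Bucket endpoints start with "s3express-"); the function name is_directory_bucket_endpoint is hypothetical.

```python
def is_directory_bucket_endpoint(endpoint):
    """Heuristic: the Directory Bucket endpoint shown above starts with 's3express-'."""
    host = endpoint.strip().lower()
    # Drop an optional scheme so both "https://host" and "host" are accepted.
    for scheme in ("https://", "http://"):
        if host.startswith(scheme):
            host = host[len(scheme):]
            break
    return host.startswith("s3express-")


for ep in ("s3.us-east-1.amazonaws.com", "s3express-usw2-az1.us-west-2.amazonaws.com"):
    kind = "directory bucket" if is_directory_bucket_endpoint(ep) else "regular bucket"
    print(f"{ep}: {kind}")
```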
Example:
"s3.access_key"="ak",
"s3.secret_key"="sk",
"s3.endpoint"="s3express-usw2-az1.us-west-2.amazonaws.com",
"s3.region"="us-west-2"
Permission Policies
Depending on the use case, permissions can be categorized into read-only and read-write policies.
1. Read-only Permissions
Only allows reading objects from S3. Suitable for LOAD, TVF, querying EXTERNAL CATALOG, and other scenarios.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>/<your-prefix>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>"
    }
  ]
}
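Hand-editing policy JSON is a common source of syntax errors such as stray commas. As a hedged alternative, the Python sketch below generates the read-only policy above for a given bucket and prefix and serializes it with the standard json module. The function name read_only_policy and the sample bucket and prefix are hypothetical; the actions and resource patterns are copied from the policy above.

```python
import json


def read_only_policy(bucket, prefix):
    """Build the read-only policy shown above for one bucket and key prefix."""
    prefix = prefix.strip("/")
    # With an empty prefix, grant object access on the whole bucket.
    objects = f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix else f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": objects,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_policy("my-bucket", "warehouse/tpch"), indent=2))
```

Since json.dumps always emits valid JSON, the output can be pasted into the IAM policy editor without manual cleanup.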
2. Read-write Permissions
Extends the read-only permissions to also allow creating, modifying, and deleting objects. Suitable for EXPORT, OUTFILE, and EXTERNAL CATALOG write-back scenarios.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>/<your-prefix>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning",
        "s3:GetLifecycleConfiguration"
      ],
      "Resource": "arn:aws:s3:::<your-bucket>"
    }
  ]
}
Notes
- Placeholder Replacement
  - <your-bucket> → Your S3 bucket name.
  - <your-prefix> → The object key prefix that Doris is allowed to access.
  - <account-id> → Your AWS account ID (12-digit number).
- Principle of Least Privilege
  - If only querying, do not grant write permissions.