Paimon Catalog
Doris currently supports accessing Paimon table metadata through various metadata services and querying Paimon data.
At present, only read operations on Paimon tables are supported. Write operations to Paimon tables will be supported in the future.
See also: Quick start with Apache Doris and Apache Paimon.
Applicable Scenarios
Scenario | Description |
---|---|
Query Acceleration | Use Doris's distributed computing engine to directly access Paimon data for query acceleration. |
Data Integration | Read Paimon data and write it into Doris internal tables, or perform ZeroETL operations using the Doris computing engine. |
Data Write-back | Not supported yet. |
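As a minimal sketch of the Data Integration scenario, the statements below copy rows from a Paimon table into a Doris internal table. The names paimon_ctl, paimon_db, paimon_tbl, doris_db, and paimon_tbl_copy are placeholders rather than objects defined on this page; internal is the name of Doris's built-in catalog.

```sql
-- Assumptions: paimon_ctl is an existing Paimon catalog, and doris_db is an
-- existing database in the built-in `internal` catalog.

-- Option 1: create the internal table from the query result (CTAS).
-- 'replication_num' = '1' only suits a single-node test setup; drop it on a real cluster.
CREATE TABLE internal.doris_db.paimon_tbl_copy
PROPERTIES ('replication_num' = '1')
AS SELECT * FROM paimon_ctl.paimon_db.paimon_tbl;

-- Option 2: append to an internal table that already exists.
INSERT INTO internal.doris_db.paimon_tbl_copy
SELECT * FROM paimon_ctl.paimon_db.paimon_tbl;
```

Both statements read through the Paimon catalog and write only to Doris, so they stay within the read-only support described above.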
Configuring Catalog
Syntax
CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
'type' = 'paimon',
'paimon.catalog.type' = '<paimon_catalog_type>',
'warehouse' = '<paimon_warehouse>',
{MetaStoreProperties},
{StorageProperties},
{CommonProperties}
);
- <paimon_catalog_type>
  The type of Paimon Catalog. The following values are supported:
  - filesystem: Default. Directly accesses metadata stored on the file system.
  - hms: Uses Hive Metastore as the metadata service.
  - dlf: Uses Alibaba Cloud DLF as the metadata service.
- <paimon_warehouse>
  The warehouse path for Paimon. This parameter must be specified when <paimon_catalog_type> is filesystem.
  The warehouse path must point to the level above the Database path. For example, if your table path is s3://bucket/path/to/db1/table1, then warehouse should be s3://bucket/path/to/.
- {MetaStoreProperties}
  The MetaStoreProperties section is used to fill in connection and authentication information for the Metastore metadata service. Refer to the section on [Supported Metadata Services] for details.
- {StorageProperties}
  The StorageProperties section is used to fill in connection and authentication information related to the storage system. Refer to the section on [Supported Storage Systems] for details.
- {CommonProperties}
  The CommonProperties section is used to fill in common properties. Please refer to the Catalog Overview section on [Common Properties].
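To make the warehouse rule concrete, here is a sketch of a filesystem catalog for the table path s3://bucket/path/to/db1/table1 used in the example above. The catalog name paimon_s3 is hypothetical, and the s3.* property keys and placeholder values are assumptions for illustration; confirm the exact storage properties for your system under [Supported Storage Systems].

```sql
-- Table path:  s3://bucket/path/to/db1/table1
-- Warehouse:   s3://bucket/path/to/   (the level above the database path db1)
-- Assumption: the s3.* keys below are illustrative storage properties;
-- replace the <...> placeholders with values for your environment.
CREATE CATALOG IF NOT EXISTS paimon_s3 PROPERTIES (
    'type' = 'paimon',
    'paimon.catalog.type' = 'filesystem',
    'warehouse' = 's3://bucket/path/to/',
    's3.endpoint' = '<s3_endpoint>',
    's3.region' = '<s3_region>',
    's3.access_key' = '<access_key>',
    's3.secret_key' = '<secret_key>'
);

-- With this warehouse, db1 should appear as a database and table1 as a table in it.
SELECT * FROM paimon_s3.db1.table1 LIMIT 10;
```

If warehouse pointed at s3://bucket/path/to/db1/ instead, the catalog would look for databases one level too deep and table1 would not resolve as a table.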
Supported Paimon Versions
Doris currently depends on Paimon version 1.0.0.
Supported Paimon Formats
- Reading Paimon Deletion Vectors is supported.
Supported Metadata Services
Supported Storage Systems
Supported Data Formats
Column Type Mapping
Paimon Type | Doris Type | Comment |
---|---|---|
boolean | boolean | |
tinyint | tinyint | |
smallint | smallint | |
integer | int | |
bigint | bigint | |
float | float | |
double | double | |
decimal(P, S) | decimal(P, S) | |
varchar | string | |
char | string | |
binary | string | |
varbinary | string | |
date | date | |
timestamp_without_time_zone | datetime(N) | N follows the Paimon precision. A precision greater than 6 is capped at 6, which may cause precision loss. |
timestamp_with_local_time_zone | datetime(N) | N follows the Paimon precision. A precision greater than 6 is capped at 6, which may cause precision loss. |
array | array | |
map | map | |
row | struct | |
other | UNSUPPORTED | |
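One way to check how a particular table's columns were mapped is to describe it from Doris. A short sketch, with paimon_ctl, paimon_db, and paimon_tbl as placeholder names:

```sql
-- Placeholder names: substitute your own catalog, database, and table.
SWITCH paimon_ctl;
USE paimon_db;

-- Lists each column with the Doris type it was mapped to.
DESC paimon_tbl;
```

Going by the table above, a Paimon timestamp column with precision 9 would be expected to show up as datetime(6) here.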
Examples
Paimon on HDFS
CREATE CATALOG paimon_hdfs PROPERTIES (
'type' = 'paimon',
'warehouse' = 'hdfs://HDFS8000871/user/paimon',
'dfs.nameservices' = 'HDFS8000871',
'dfs.ha.namenodes.HDFS8000871' = 'nn1,nn2',
'dfs.namenode.rpc-address.HDFS8000871.nn1' = '172.21.0.1:4007',
'dfs.namenode.rpc-address.HDFS8000871.nn2' = '172.21.0.2:4007',
'dfs.client.failover.proxy.provider.HDFS8000871' = 'org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'hadoop.username' = 'hadoop'
);
Paimon on HMS
CREATE CATALOG paimon_hms PROPERTIES (
'type' = 'paimon',
'paimon.catalog.type' = 'hms',
'warehouse' = 'hdfs://HDFS8000871/user/zhangdong/paimon2',
'hive.metastore.uris' = 'thrift://172.21.0.44:7004',
'dfs.nameservices' = 'HDFS8000871',
'dfs.ha.namenodes.HDFS8000871' = 'nn1,nn2',
'dfs.namenode.rpc-address.HDFS8000871.nn1' = '172.21.0.1:4007',
'dfs.namenode.rpc-address.HDFS8000871.nn2' = '172.21.0.2:4007',
'dfs.client.failover.proxy.provider.HDFS8000871' = 'org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'hadoop.username' = 'hadoop'
);
Paimon on DLF
CREATE CATALOG paimon_dlf PROPERTIES (
'type' = 'paimon',
'paimon.catalog.type' = 'dlf',
'warehouse' = 'oss://xx/yy/',
'dlf.proxy.mode' = 'DLF_ONLY',
'dlf.uid' = 'xxxxx',
'dlf.region' = 'cn-beijing',
'dlf.access_key' = 'ak',
'dlf.secret_key' = 'sk'
);
Query Operations
Basic Query
Once the Catalog is configured, you can query the table data in the Catalog as follows:
-- 1. Switch to catalog, use database, and query
SWITCH paimon_ctl;
USE paimon_db;
SELECT * FROM paimon_tbl LIMIT 10;
-- 2. Use Paimon database directly
USE paimon_ctl.paimon_db;
SELECT * FROM paimon_tbl LIMIT 10;
-- 3. Use fully qualified name to query
SELECT * FROM paimon_ctl.paimon_db.paimon_tbl LIMIT 10;
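Before querying, it can help to confirm that the catalog is registered and to see which databases and tables it exposes. A sketch using the same placeholder names (paimon_ctl, paimon_db):

```sql
-- List all catalogs known to Doris, including the built-in `internal` catalog.
SHOW CATALOGS;

-- Browse the Paimon catalog (placeholder names).
SWITCH paimon_ctl;
SHOW DATABASES;
USE paimon_db;
SHOW TABLES;

-- Return to the built-in catalog when done.
SWITCH internal;
```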