Overview

Creating tables

Use the CREATE TABLE statement to create a table in Doris. You can also use CREATE TABLE LIKE or CREATE TABLE AS to derive the table definition from another table.
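As a minimal sketch of the three forms, the statements below create a table directly, copy a definition with CREATE TABLE LIKE, and derive a table from a query with CREATE TABLE AS. The database, table, and column names are hypothetical, and "replication_num" = "1" is an assumption suited to a single-node test cluster, not a recommendation.

```sql
-- Hypothetical names throughout; adjust to your own schema.
-- "replication_num" = "1" assumes a single-backend test cluster.
CREATE TABLE IF NOT EXISTS example_db.orders (
    order_id   BIGINT NOT NULL,
    order_date DATE   NOT NULL,
    amount     DECIMAL(10, 2)
)
DUPLICATE KEY(order_id)
DISTRIBUTED BY HASH(order_id) BUCKETS 10
PROPERTIES ("replication_num" = "1");

-- Derive an empty table with the same definition as example_db.orders.
CREATE TABLE example_db.orders_like LIKE example_db.orders;

-- Derive both the definition and the data from a query result.
CREATE TABLE example_db.orders_daily
PROPERTIES ("replication_num" = "1")
AS
SELECT order_date, SUM(amount) AS total_amount
FROM example_db.orders
GROUP BY order_date;
```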

Table name

In Doris, table names are case-sensitive by default. To make them case-insensitive, configure lower_case_table_names during the initial cluster setup. The default maximum length for table names is 64 bytes; you can change it by configuring table_name_length_limit, but setting this value too high is not recommended. For the table creation syntax, refer to CREATE TABLE.
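To check what a running cluster currently uses, the read-only statements below can serve as a sketch. They assume that lower_case_table_names is exposed as a session-visible variable and that table_name_length_limit is a frontend (FE) configuration item, which may vary by Doris version.

```sql
-- Show the case-sensitivity mode chosen at cluster initialization.
SHOW VARIABLES LIKE 'lower_case_table_names';

-- Show the current maximum table name length (assumed to be an FE config item).
ADMIN SHOW FRONTEND CONFIG LIKE 'table_name_length_limit';
```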

Table property

In Doris, the CREATE TABLE statement can specify table properties, including:

  • buckets: Determines the distribution of data within the table.

  • storage_medium: Controls the storage method for data, such as using HDD, SSD, or remote shared storage.

  • replication_num: Controls the number of data replicas to ensure redundancy and reliability.

  • storage_policy: Controls how data migrates between hot and cold storage tiers.

These properties are applied at the partition level: each partition records its own properties when it is created. Modifying table properties therefore affects only partitions created afterward, not existing partitions. For more information about table properties, refer to ALTER TABLE PROPERTY.
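The following sketch shows where these settings appear in a CREATE TABLE statement and how a later change distinguishes future partitions from existing ones. All object names are hypothetical, the values are illustrative, and "storage_medium" = "HDD" assumes the backends have HDD storage paths configured. Note that the bucket count is written in the DISTRIBUTED BY clause rather than inside PROPERTIES.

```sql
-- Hypothetical table; property values are illustrative only.
CREATE TABLE IF NOT EXISTS example_db.events (
    event_date DATE   NOT NULL,
    user_id    BIGINT NOT NULL,
    payload    VARCHAR(256)
)
DUPLICATE KEY(event_date, user_id)
PARTITION BY RANGE(event_date) (
    PARTITION p202401 VALUES LESS THAN ("2024-02-01")
)
DISTRIBUTED BY HASH(user_id) BUCKETS 8   -- bucket count
PROPERTIES (
    "replication_num" = "3",             -- assumes at least three backends
    "storage_medium"  = "HDD"
);

-- Change the default replica count: applies to partitions created later.
ALTER TABLE example_db.events SET ("default.replication_num" = "2");

-- Existing partitions keep their own properties and must be changed explicitly.
ALTER TABLE example_db.events MODIFY PARTITION (p202401) SET ("replication_num" = "2");
```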

Notes

  1. Choose an appropriate data model: The data model cannot be changed after the table is created, so select an appropriate one when creating the table.

  2. Choose an appropriate number of buckets: The number of buckets in an existing partition cannot be modified. To change it, replace the partition, or, for dynamic partitions, modify the number of buckets used by partitions that have not yet been created.

  3. Column change operations: Adding or removing VALUE columns is a lightweight operation that can be completed in seconds. Adding or removing KEY columns or modifying data types is a heavyweight operation whose completion time depends on the amount of data. For large datasets, avoid adding or removing KEY columns or modifying data types.

  4. Optimize storage strategy: You can use tiered storage to store cold data on HDD or S3/HDFS.
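Notes 2 and 3 can be illustrated with a few ALTER TABLE statements. This is a sketch under stated assumptions: example_db.events is a hypothetical duplicate-key table with key columns event_date and user_id, dynamic partitioning is assumed to be already enabled on it, and the new column names and bucket count are invented for illustration.

```sql
-- Note 2: change the bucket count used by dynamic partitions created in the
-- future (assumes dynamic partitioning is enabled on this hypothetical table).
-- Partitions that already exist keep their current bucket count.
ALTER TABLE example_db.events SET ("dynamic_partition.buckets" = "16");

-- Note 3, lightweight: add a VALUE (non-key) column.
ALTER TABLE example_db.events ADD COLUMN source VARCHAR(32) DEFAULT "unknown";

-- Note 3, heavyweight: add a KEY column; duration depends on data volume.
ALTER TABLE example_db.events ADD COLUMN region VARCHAR(16) KEY DEFAULT "" AFTER user_id;

-- Check the progress of schema change jobs in the database.
SHOW ALTER TABLE COLUMN FROM example_db;
```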