
Doris Compute-Storage Decoupled Deployment Preparation

1. Overview

This document describes the preparation required before deploying Apache Doris in compute-storage decoupled mode. The decoupled architecture is designed to improve scalability and performance, and is suited to large-scale data processing scenarios.

2. Architecture Components

The Doris compute-storage decoupled architecture consists of three main modules:

  1. Frontend (FE): Handles user requests and manages metadata.
  2. Backend (BE): Stateless compute nodes that execute query tasks.
  3. Meta Service (MS): Manages metadata operations and data recovery.

3. System Requirements

3.1 Hardware Requirements

  • Minimum configuration: 3 servers
  • Recommended configuration: 5 or more servers

3.2 Software Dependencies

  • FoundationDB (FDB) version 7.1.38 or higher
  • OpenJDK 17
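To check an installed version against the 7.1.38 minimum, a version-aware comparison is more reliable than a plain string comparison. The following is a minimal sketch, assuming GNU `sort -V` is available; `version_ge` is a hypothetical helper name and is not part of the Doris or FoundationDB tooling.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when version A >= version B.
# Hypothetical helper; relies on the -V (version sort) option of GNU sort.
version_ge() {
    # The smaller of the two versions sorts first; A >= B when that is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: replace this value with the version actually installed.
installed="7.1.38"
if version_ge "$installed" "7.1.38"; then
    echo "FDB version OK: $installed"
else
    echo "FDB version too old: $installed"
fi
```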

4. Deployment Planning

4.1 Testing Environment Deployment

Deploy all modules on a single machine. This layout is not suitable for production environments.

4.2 Production Deployment

  • Deploy FDB on 3 or more machines
  • Deploy FE and Meta Service on 3 or more machines
  • Deploy BE on 3 or more machines

On machines with ample resources, FDB, FE, and Meta Service can be co-located on the same hosts, but they should not share disks.

5. Installation Steps

5.1 Install FoundationDB

This section is a step-by-step guide to configuring, deploying, and starting the FoundationDB (FDB) service with the scripts fdb_vars.sh and fdb_ctl.sh. Both scripts are in the fdb directory of the Doris tools package, which you can download.

5.1.1 Machine Requirements

Typically, at least 3 machines equipped with SSDs are required to form a FoundationDB cluster that keeps two replicas of the data and tolerates the failure of a single machine.

Tip: For development or testing purposes only, a single machine is sufficient.

5.1.2 fdb_vars.sh Configuration

Required Custom Settings

  • DATA_DIRS
    Description: Specify the data directories for FoundationDB storage
    Type: Comma-separated list of absolute paths
    Example: /mnt/foundationdb/data1,/mnt/foundationdb/data2,/mnt/foundationdb/data3
    Notes:
      - Ensure directories are created before running the script
      - SSD and separate directories are recommended for production environments
  • FDB_CLUSTER_IPS
    Description: Define cluster IPs
    Type: String (comma-separated IP addresses)
    Example: 172.200.0.2,172.200.0.3,172.200.0.4
    Notes:
      - At least 3 IP addresses for production clusters
      - The first IP will be used as the coordinator
      - For high availability, place machines in different racks
  • FDB_HOME
    Description: Define the main directory for FoundationDB
    Type: Absolute path
    Example: /fdbhome
    Notes:
      - Default path is /fdbhome
      - Ensure this path is absolute
  • FDB_CLUSTER_ID
    Description: Define the cluster ID
    Type: String
    Example: SAQESzbh
    Notes:
      - Each cluster ID must be unique
      - Can be generated using mktemp -u XXXXXXXX
  • FDB_CLUSTER_DESC
    Description: Define the description of the FDB cluster
    Type: String
    Example: dorisfdb
    Notes:
      - It is recommended to change this to something meaningful for the deployment

Optional Custom Settings

  • MEMORY_LIMIT_GB
    Description: Define the memory limit for FDB processes in GB
    Type: Integer
    Example: MEMORY_LIMIT_GB=16
    Notes: Adjust this value based on available memory resources and FDB process requirements
  • CPU_CORES_LIMIT
    Description: Define the CPU core limit for FDB processes
    Type: Integer
    Example: CPU_CORES_LIMIT=8
    Notes: Set this value based on the number of available CPU cores and FDB process requirements
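Put together, the settings above might look like the fragment below. It is only a sketch: the values are the example values listed above, the plain `NAME=value` assignment style is an assumption about how fdb_vars.sh is written, and `check_data_dirs` is a hypothetical pre-flight helper that is not part of the provided scripts.

```shell
#!/usr/bin/env bash
# Illustrative values taken from the examples listed above.
DATA_DIRS="/mnt/foundationdb/data1,/mnt/foundationdb/data2,/mnt/foundationdb/data3"
FDB_CLUSTER_IPS="172.200.0.2,172.200.0.3,172.200.0.4"
FDB_HOME="/fdbhome"
FDB_CLUSTER_ID="SAQESzbh"       # generate your own, e.g. with: mktemp -u XXXXXXXX
FDB_CLUSTER_DESC="dorisfdb"
MEMORY_LIMIT_GB=16
CPU_CORES_LIMIT=8

# check_data_dirs LIST: hypothetical helper, not part of fdb_ctl.sh.
# Reports every entry of a comma-separated list that is not an absolute
# path to an existing directory; fails if any entry is reported.
# The body runs in a subshell so the IFS change does not leak out.
check_data_dirs() (
    IFS=','
    rc=0
    for dir in $1; do
        case "$dir" in
            /*) ;;
            *)  echo "not an absolute path: $dir"; rc=1; continue ;;
        esac
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir"
            rc=1
        fi
    done
    exit "$rc"
)
```

Running `check_data_dirs "$DATA_DIRS"` on each node before deployment surfaces directories that still need to be created.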

5.1.3 Deploy FDB Cluster

After configuring the environment with fdb_vars.sh, you can deploy the FDB cluster on each node using the fdb_ctl.sh script.

./fdb_ctl.sh deploy

This command initiates the deployment process of the FDB cluster.
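Because the script has to be run on every node, it can help to list the per-node commands before executing anything. The sketch below only prints the commands. It assumes SSH access to each node and the same script directory on all of them; the helper name `print_node_commands` and the path `/opt/doris-tools/fdb` are hypothetical.

```shell
#!/usr/bin/env bash
# print_node_commands IPS SCRIPT_DIR ACTION
# Hypothetical helper: prints one ssh command per node instead of running
# it, so the plan can be reviewed first. Runs in a subshell to keep the
# IFS change local.
print_node_commands() (
    IFS=','
    for ip in $1; do
        echo "ssh $ip 'cd $2 && ./fdb_ctl.sh $3'"
    done
)

# Assumption: the scripts live in /opt/doris-tools/fdb on every node.
print_node_commands "172.200.0.2,172.200.0.3,172.200.0.4" "/opt/doris-tools/fdb" deploy
```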

5.1.4 Start FDB Service

Once the FDB cluster is deployed, you can start the FDB service using the fdb_ctl.sh script.

./fdb_ctl.sh start

This command starts the FDB service and brings the cluster online. It also provides the FDB cluster connection string, which is needed later when configuring the Meta Service.
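FoundationDB connection strings have the form `description:ID@IP:PORT[,IP:PORT...]`. As a sanity check, the sketch below assembles what the string would look like for the example values, assuming a single coordinator (the first IP in FDB_CLUSTER_IPS) and FoundationDB's default port 4500. `fdb_cluster_string` is a hypothetical helper; treat the string reported by the script as authoritative.

```shell
#!/usr/bin/env bash
# fdb_cluster_string DESC ID COORDINATOR_IP [PORT]
# Hypothetical helper that formats a connection string for comparison.
# PORT defaults to 4500, the FoundationDB default; your deployment may differ.
fdb_cluster_string() {
    printf '%s:%s@%s:%s\n' "$1" "$2" "$3" "${4:-4500}"
}

fdb_cluster_string dorisfdb SAQESzbh 172.200.0.2
# prints: dorisfdb:SAQESzbh@172.200.0.2:4500
```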

5.2 Install OpenJDK 17

  1. Download OpenJDK 17.
  2. Extract it and set the JAVA_HOME environment variable to the extracted directory.
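The two steps above can be sketched as follows. The install path `/opt/jdk-17` is an assumption, and `java_major` is a hypothetical helper that extracts the major version from the text printed by `java -version`.

```shell
#!/usr/bin/env bash
# Assumption: OpenJDK 17 was extracted to /opt/jdk-17; adjust to your path.
export JAVA_HOME="/opt/jdk-17"
export PATH="$JAVA_HOME/bin:$PATH"

# java_major TEXT: print the major version found on the first line of
# `java -version` output, e.g. 'openjdk version "17.0.9" 2023-10-17'.
java_major() {
    printf '%s\n' "$1" | sed -n '1s/.*version "\([0-9][0-9]*\).*/\1/p'
}

# Usage on a real host (java -version writes to stderr):
#   java_major "$(java -version 2>&1)"
```

For legacy version strings such as "1.8.0_292" the helper prints 1, so any result other than 17 indicates that a different JDK is first on the PATH.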

5.3 Install S3 or HDFS (Optional)

In compute-storage decoupled (cloud) mode, Apache Doris stores data on S3 or HDFS services. If you already have such a service, you can use it directly. If not, the following is a simple deployment walkthrough for MinIO:

  1. On MinIO's download page, choose the appropriate version and operating system, then download the corresponding binaries or installation packages for the server and client.
  2. Start the MinIO server
    export MINIO_REGION_NAME=us-east-1
    export MINIO_ROOT_USER=minio # In older versions, the configuration is MINIO_ACCESS_KEY=minio
    export MINIO_ROOT_PASSWORD=minioadmin # In older versions, the configuration is MINIO_SECRET_KEY=minioadmin
    nohup ./minio server /mnt/data 2>&1 &
  3. Configure the MinIO client
    # A client installed from an installation package is named mcli; a directly downloaded client binary is named mc
    # Newer mc releases replace "config host add" with: ./mc alias set myminio http://127.0.0.1:9000 minio minioadmin
    ./mc config host add myminio http://127.0.0.1:9000 minio minioadmin
  4. Create a bucket
    ./mc mb myminio/doris
  5. Verify that it works properly
    # upload a file (mv removes the local copy; use cp to keep it)
    ./mc mv test_file myminio/doris
    # list files
    ./mc ls myminio/doris

6. Next Steps

After completing the above preparations, please refer to the following documents to continue the deployment:

  1. Deployment
  2. Managing Compute Group
  3. Managing Storage Vault

7. Notes

  • Ensure time synchronization across all nodes
  • Regularly back up FoundationDB data
  • Adjust FoundationDB and Doris configuration parameters based on actual load

8. References