
Doris Compute-Storage Decoupled Deployment Preparation

1. Overview

This document describes the preparation required before deploying Apache Doris in compute-storage decoupled mode. The decoupled architecture aims to improve system scalability and performance and is suited to large-scale data processing scenarios.

2. Architecture Components

The Doris compute-storage decoupled architecture consists of three main modules:

  1. Frontend (FE): Handles user requests and manages metadata.
  2. Backend (BE): Stateless compute nodes that execute query tasks.
  3. Meta Service (MS): Manages metadata operations and data recycling.

3. System Requirements

3.1 Hardware Requirements

  • Minimum configuration: 3 servers
  • Recommended configuration: 5 or more servers

3.2 Software Dependencies

  • FoundationDB (FDB) version 7.1.38 or higher
  • OpenJDK 17

4. Deployment Planning

4.1 Testing Environment Deployment

Deploy all modules on a single machine. This layout is not suitable for production environments.

4.2 Production Deployment

  • Deploy FDB on 3 or more machines
  • Deploy FE and Meta Service on 3 or more machines
  • Deploy BE on 3 or more machines

On machines with ample resources, FDB, FE, and Meta Service can be co-located on the same hosts, but each service should use its own disks.

5. Installation Steps

5.1 Install FoundationDB

This section provides a step-by-step guide to configuring, deploying, and starting the FoundationDB (FDB) service using the provided scripts fdb_vars.sh and fdb_ctl.sh. Both scripts are in the fdb directory of the Doris tools package, which you can download.

5.1.1 Machine Requirements

Typically, at least 3 machines equipped with SSDs are required to form a FoundationDB cluster that keeps two replicas of the data and tolerates the failure of a single machine. If SSDs are not available, use at least standard cloud disks or local disks with a standard POSIX-compliant file system for data storage. Otherwise, FoundationDB may fail to operate properly. For instance, storage solutions like JuiceFS should not be used as the underlying storage for FoundationDB.

Tip: If the cluster is only for development or testing purposes, a single machine is sufficient.

5.1.2 fdb_vars.sh Configuration

Required Custom Settings

| Parameter | Description | Type | Example | Notes |
|---|---|---|---|---|
| DATA_DIRS | Specify the data directory for FoundationDB storage | Comma-separated list of absolute paths | /mnt/foundationdb/data1,/mnt/foundationdb/data2,/mnt/foundationdb/data3 | Ensure directories are created before running the script; SSD and separate directories are recommended for production environments |
| FDB_CLUSTER_IPS | Define cluster IPs | String (comma-separated IP addresses) | 172.200.0.2,172.200.0.3,172.200.0.4 | At least 3 IP addresses for production clusters; the first IP will be used as the coordinator; for high availability, place machines in different racks |
| FDB_HOME | Define the main directory for FoundationDB | Absolute path | /fdbhome | Default path is /fdbhome; ensure this path is absolute |
| FDB_CLUSTER_ID | Define the cluster ID | String | SAQESzbh | Each cluster ID must be unique; can be generated using mktemp -u XXXXXXXX |
| FDB_CLUSTER_DESC | Define the description of the FDB cluster | String | dorisfdb | It is recommended to change this to something meaningful for the deployment |

Optional Custom Settings

| Parameter | Description | Type | Example | Notes |
|---|---|---|---|---|
| MEMORY_LIMIT_GB | Define the memory limit for FDB processes in GB | Integer | MEMORY_LIMIT_GB=16 | Adjust this value based on available memory resources and FDB process requirements |
| CPU_CORES_LIMIT | Define the CPU core limit for FDB processes | Integer | CPU_CORES_LIMIT=8 | Set this value based on the number of available CPU cores and FDB process requirements |
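
As an illustration, the settings above might appear in fdb_vars.sh roughly as follows. This is a sketch that reuses the example values from the tables. The plain `NAME=value` assignment style is an assumption about how the script is laid out, so check it against the actual file before editing.

```shell
# Required settings (example values taken from the tables above).
DATA_DIRS="/mnt/foundationdb/data1,/mnt/foundationdb/data2,/mnt/foundationdb/data3"
FDB_CLUSTER_IPS="172.200.0.2,172.200.0.3,172.200.0.4"  # first IP acts as coordinator
FDB_HOME="/fdbhome"
FDB_CLUSTER_ID="SAQESzbh"   # a fresh ID can be generated with: mktemp -u XXXXXXXX
FDB_CLUSTER_DESC="dorisfdb"

# Optional settings; size these to the machine.
MEMORY_LIMIT_GB=16
CPU_CORES_LIMIT=8
```

Every node in the cluster should presumably use the same values, since the cluster ID, description, and IP list identify the cluster as a whole.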

5.1.3 Deploy FDB Cluster

After configuring the environment with fdb_vars.sh, you can deploy the FDB cluster on each node using the fdb_ctl.sh script.

./fdb_ctl.sh deploy

This command initiates the deployment process of the FDB cluster.
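
The DATA_DIRS setting requires its directories to exist before the script runs. A minimal pre-flight sketch for creating them on each node before deployment is shown here. The helper name ensure_data_dirs is hypothetical and is not part of the Doris tools.

```shell
# Create every directory named in a comma-separated list (the DATA_DIRS format).
# Hypothetical helper for illustration; not shipped with fdb_ctl.sh.
ensure_data_dirs() {
    old_ifs=$IFS
    IFS=','
    for dir in $1; do
        IFS=$old_ifs
        case "$dir" in
            /*) mkdir -p "$dir" || return 1 ;;                      # absolute path: create it
            *)  echo "not an absolute path: $dir" >&2; return 1 ;;  # reject relative paths
        esac
    done
    IFS=$old_ifs
}
```

With the example configuration, this would be called as `ensure_data_dirs "$DATA_DIRS"` after the variables from fdb_vars.sh have been set in the current shell.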

5.1.4 Start FDB Service

Once the FDB cluster is deployed, you can start the FDB service on each node using the fdb_ctl.sh script.

./fdb_ctl.sh start

This command starts the FDB service, brings the cluster online, and yields the FDB cluster connection string, which is needed later to configure the Meta Service.
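
For orientation, a FoundationDB connection string has the form `description:ID@IP:port`. The sketch below assembles one from the fdb_vars.sh values so that the result of the start step can be sanity-checked. The function name is hypothetical, and port 4500 (FoundationDB's usual default) is an assumption, since the port that fdb_ctl.sh configures may differ. The string reported by the script remains the authoritative one.

```shell
# Build a connection string of the form description:ID@host:port.
# ASSUMPTION: port defaults to 4500; pass a fourth argument to override it.
# Hypothetical helper for illustration; not part of the Doris tools.
fdb_connection_string() {
    desc=$1
    id=$2
    ips=$3
    port=${4:-4500}
    coordinator=${ips%%,*}   # first IP in the comma-separated list
    printf '%s:%s@%s:%s\n' "$desc" "$id" "$coordinator" "$port"
}

fdb_connection_string dorisfdb SAQESzbh 172.200.0.2,172.200.0.3,172.200.0.4
# prints: dorisfdb:SAQESzbh@172.200.0.2:4500
```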

5.2 Install OpenJDK 17

  1. Download OpenJDK 17.
  2. Extract the archive and set the JAVA_HOME environment variable to the extracted directory.
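
A minimal sketch of the second step follows. The path /opt/jdk-17 is a placeholder for wherever the archive was extracted, not a location the Doris documentation prescribes.

```shell
# Hypothetical install location; replace with the actual extraction directory.
export JAVA_HOME="${JAVA_HOME:-/opt/jdk-17}"
export PATH="$JAVA_HOME/bin:$PATH"

# Report the active Java version if the binary is present.
if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version
else
    echo "java not found under $JAVA_HOME" >&2
fi
```

Adding the two export lines to the shell profile of the user that runs the Doris processes keeps the setting across logins.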

6. Next Steps

After completing the above preparations, please refer to the following documents to continue the deployment:

  1. Deployment
  2. Managing Compute Group
  3. Managing Storage Vault

7. Notes

  • Ensure time synchronization across all nodes
  • Regularly back up FoundationDB data
  • Adjust FoundationDB and Doris configuration parameters based on actual load
  • Use standard cloud disks or local disks with a POSIX-compliant file system for data storage; otherwise, FoundationDB may not function properly.
    • For example, storage solutions like JuiceFS should not be used as FoundationDB's storage backend.
