Skip to main content

Configure FE

What you will learn in this chapter

  • How to configure compute resources (CPU and memory) for the FE component
  • How to configure the number of FE Followers and their roles
  • How to customize FE startup parameters with a ConfigMap
  • How to choose the access mode for the FE service (ClusterIP/NodePort/LoadBalancer) based on your access scenario
  • How to configure persistent storage for FE to prevent metadata loss

Configuration overview

In storage-compute separation mode, FE (Frontend) is mainly responsible for query parsing and planning. This chapter introduces FE configuration in the following order:

Configuration itemProblem it solves
Compute resource configurationExplicitly allocates CPU and memory to FE
Follower node count configurationPlans the metadata management nodes in a distributed deployment
Custom startup configurationOverrides default startup parameters via a ConfigMap
Access mode configurationExposes the FE service based on the access scenario (in-cluster, out-of-cluster, or cloud platform)
Persistent storage configurationPrevents metadata loss after FE restarts

Configure compute resources

In the deployment example provided by the Doris-Operator repository, FE has no resource limits by default. For production environments, it is recommended to explicitly configure FE compute resources via Kubernetes requests and limits.

The following example allocates 8c8Gi of compute resources to FE:

spec:
feSpec:
requests:
cpu: 8
memory: 8Gi
limits:
cpu: 8
memory: 8Gi

Apply this configuration to the DorisDisaggregatedCluster resource you want to deploy.

Configure the number of Follower nodes

The FE service has two roles, with the following responsibilities:

RoleResponsibilities
FollowerHandles SQL parsing and manages and stores metadata
ObserverHandles SQL parsing and shares the query and write load of Followers

Doris uses the bdbje storage system to manage metadata. The underlying implementation of bdbje is similar to a Paxos protocol algorithm. In a distributed deployment, multiple Follower nodes must be configured to participate in metadata management together.

When you deploy a Doris storage-compute separation cluster with the DorisDisaggregatedCluster resource, the default number of Followers is 1. You can adjust the number of Follower nodes via the electionNumber field. The following example sets the number of Followers to 3:

spec:
feSpec:
electionNumber: 3
Tip

After a storage-compute separation cluster is deployed, electionNumber cannot be modified. Plan the number of Followers before deployment.

Custom startup configuration

Doris Operator mounts the FE startup configuration through a Kubernetes ConfigMap. The configuration procedure is as follows:

Step 1: Write the FE startup ConfigMap

Define a ConfigMap that contains the FE startup configuration, as shown below:

apiVersion: v1
kind: ConfigMap
metadata:
name: fe-configmap
namespace: default
labels:
app.kubernetes.io/component: fe
data:
fe.conf: |
CUR_DATE=`date +%Y%m%d-%H%M%S`
# Log dir
LOG_DIR = ${DORIS_HOME}/log
# For jdk 17, this JAVA_OPTS will be used as default JVM options
JAVA_OPTS_FOR_JDK_17="-Djavax.security.auth.useSubjectCredsOnly=false -Xmx8192m -Xms8192m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOG_DIR -Xlog:gc*:$LOG_DIR/fe.gc.log.$CUR_DATE:time,uptime:filecount=10,filesize=50M --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens java.base/jdk.internal.ref=ALL-UNNAMED"
# INFO, WARN, ERROR, FATAL
sys_log_level = INFO
# NORMAL, BRIEF, ASYNC
sys_log_mode = NORMAL
# Default dirs to put jdbc drivers,default value is ${DORIS_HOME}/jdbc_drivers
# jdbc_drivers_dir = ${DORIS_HOME}/jdbc_drivers
http_port = 8030
rpc_port = 9020
query_port = 9030
edit_log_port = 9010
enable_fqdn_mode=true
deploy_mode = cloud

Step 2: Deploy the ConfigMap to the target namespace

Deploy the ConfigMap to the namespace where the DorisDisaggregatedCluster resides with the following command:

kubectl apply -n ${namespace} -f ${feConfigMapName}.yaml

Parameter description:

  • ${namespace}: the namespace where the DorisDisaggregatedCluster resides
  • ${feConfigMapName}: the name of the file that contains the configuration above

Step 3: Reference the ConfigMap in DorisDisaggregatedCluster

Update the DorisDisaggregatedCluster resource and mount the ConfigMap through the feSpec.configMaps array, as shown below:

spec:
feSpec:
replicas: 2
configMaps:
- name: fe-configmap
Tip

When customizing the startup configuration in a Kubernetes deployment, note the following two points:

  1. Do not add the meta_service_endpoint or cluster_id configuration. Doris-Operator injects these values automatically.
  2. You must set enable_fqdn_mode=true.

Access configuration

Doris-Operator uses Kubernetes Service to provide VIP and load balancing capabilities. Depending on the access scenario, you can choose one of the following three exposure modes:

Access modeApplicable scenarioDescription
ClusterIPAccess only from within the Kubernetes clusterThe default mode, no extra configuration is required
NodePortAccess from outside the Kubernetes cluster (commonly used in self-built clusters)Exposes the service via the host port
LoadBalancerAccess through a cloud load balancer in a cloud platform environmentThe load balancer is provided by the cloud service provider

The following sections describe the configuration of each mode.

ClusterIP mode

Kubernetes uses ClusterIP mode by default. This mode provides an internal address inside the Kubernetes cluster that can only be accessed from within the cluster.

Step 1: Configure ClusterIP

ClusterIP is the default access mode and no additional changes are required to use it.

Step 2: Get the Service access address

After the cluster is deployed, view the Services exposed by FE with the following command:

kubectl -n doris get svc

A sample result is shown below:

NAME                              TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                               AGE
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 14m
doriscluster-sample-fe ClusterIP 10.1.118.16 <none> 8030/TCP,9020/TCP,9030/TCP,9010/TCP 14m

The result contains two kinds of Services:

  • With the internal suffix: used only for Doris internal communication (such as heartbeat and data exchange), and is not exposed externally
  • Without the internal suffix: used for external access to the FE service

Step 3: Access Doris from inside a container

Run the following command to create a Pod that contains a MySQL client in the current Kubernetes cluster:

kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never --namespace=doris -- /bin/bash

Inside the container, connect to the Doris cluster using the Service name without the internal suffix:

mysql -uroot -P9030 -hdoriscluster-sample-fe-service

NodePort mode

To access Doris from outside the Kubernetes cluster, use NodePort mode. NodePort mode supports two ways to allocate ports:

Allocation methodDescription
Dynamic host port allocationWhen no port mapping is set explicitly, Kubernetes automatically assigns an unused host port (default range 30000-32767)
Static host port allocationThe port mapping is specified explicitly. The port is fixed when the host port is not occupied and there is no conflict

Static allocation requires planning the port mapping. Doris provides the following ports for external interaction by default:

Table 1: FE service port descriptions

Port nameDefault portPort description
Query Port9030Used to access the Doris cluster via the MySQL protocol
HTTP Port8030The HTTP Server port on FE, used to view FE information

Step 1: Configure FE NodePort

Choose one of the following configurations as needed:

  • Dynamic port allocation:

    spec:
    feSpec:
    service:
    type: NodePort
  • Static port allocation:

    spec:
    feSpec:
    service:
    type: NodePort
    portMaps:
    - nodePort: 31001
    targetPort: 8030
    - nodePort: 31002
    targetPort: 9030

Step 2: Get the Service

After the cluster is deployed, view the Service with the following command:

kubectl get service

The result is shown below:

NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                       AGE
kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 169d
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 2d
doriscluster-sample-fe NodePort 10.152.183.58 <none> 8030:31041/TCP,9020:30783/TCP,9030:31545/TCP,9010:31610/TCP 2d

Step 3: Access Doris through NodePort

Take a MySQL connection as an example. Assume the Doris Query Port is mapped to host port 31545. The steps are as follows:

  1. Get the IP address of any node in the Kubernetes cluster:

    kubectl get nodes -owide

    Sample result:

    NAME   STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE          KERNEL-VERSION          CONTAINER-RUNTIME
    r60 Ready control-plane 14d v1.28.2 192.168.88.60 <none> CentOS Stream 8 4.18.0-294.el8.x86_64 containerd://1.6.22
    r61 Ready <none> 14d v1.28.2 192.168.88.61 <none> CentOS Stream 8 4.18.0-294.el8.x86_64 containerd://1.6.22
    r62 Ready <none> 14d v1.28.2 192.168.88.62 <none> CentOS Stream 8 4.18.0-294.el8.x86_64 containerd://1.6.22
    r63 Ready <none> 14d v1.28.2 192.168.88.63 <none> CentOS Stream 8 4.18.0-294.el8.x86_64 containerd://1.6.22
  2. Use the IP of any node (such as 192.168.88.62) and the mapped port to connect to the Doris cluster:

    mysql -h 192.168.88.62 -P 31545 -uroot

LoadBalancer mode

LoadBalancer mode is suitable for Kubernetes environments on cloud platforms, where the load balancer is provided by the cloud service provider.

Step 1: Configure LoadBalancer mode

Set the type to LoadBalancer in feSpec.service:

spec:
feSpec:
service:
type: LoadBalancer
annotations:
service.beta.kubernetes.io/load-balancer-type: "external"

Step 2: Get the Service

After the cluster is deployed, view the Service with the following command:

kubectl get service

Sample result:

NAME                              TYPE           CLUSTER-IP       EXTERNAL-IP                                                                     PORT(S)                                                       AGE
kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 169d
doriscluster-sample-fe-internal ClusterIP None <none> 9030/TCP 2d
doriscluster-sample-fe LoadBalancer 10.152.183.58 ac4828493dgrftb884g67wg4tb68gyut-1137856348.us-east-1.elb.amazonaws.com 8030:31041/TCP,9020:30783/TCP,9030:31545/TCP,9010:31610/TCP 2d

Step 3: Access through the LoadBalancer

Take a MySQL connection as an example. Assume the Query Port listens on 9030. Connect to the Doris cluster with the following command:

mysql -h ac4828493dgrftb884g67wg4tb68gyut-1137856348.us-east-1.elb.amazonaws.com -P 9030 -uroot

Persistent storage

In the default deployment, the FE service uses Kubernetes EmptyDir as the metadata storage mode. Because EmptyDir is a non-persistent storage mode, metadata is lost after the service restarts.

To ensure FE metadata is not lost after a restart, configure persistent storage for FE. Doris Operator provides the following three options. Choose one based on your needs:

OptionApplicable scenario
Auto-generate using a storage templateLogs and metadata use the same storage configuration, simplifying the setup
Custom mount point configurationDifferent directories require different storage specifications
Do not persist logsLogs are written only to standard output and do not need to be persisted

Auto-generate using a storage template

Configure unified persistence for logs and metadata through a storage template, as shown below:

spec:
feSpec:
persistentVolumes:
- persistentVolumeClaimSpec:
# storageClassName: ${storageclass_name}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 200Gi

After the cluster is deployed with this configuration, the following takes effect:

  • Doris Operator automatically mounts persistent storage for the log directory (default /opt/apache-doris/fe/log) and the metadata directory (default /opt/apache-doris/fe/doris-meta)
  • If the log or metadata directory is explicitly specified in the custom startup configuration, Doris Operator parses it automatically and mounts the storage accordingly
  • Persistent storage uses StorageClass mode. You can specify the desired StorageClass through storageClassName

Custom mount point configuration

Doris Operator supports per-directory storage configuration. For example, mount a 300Gi disk for the log directory using a custom storage configuration, and mount a 200Gi disk for the metadata directory using the storage template:

spec:
feSpec:
persistentVolumes:
- mountPaths:
- /opt/apache-doris/fe/log
persistentVolumeClaimSpec:
# storageClassName: ${storageclass_name}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 300Gi
- persistentVolumeClaimSpec:
# storageClassName: ${storageclass_name}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 200Gi
Tip

If the mountPaths array is empty, the storage configuration is treated as a template configuration (that is, it applies to all directories that are not configured separately).

Do not persist logs

If you do not want to persist logs and only want them written to standard output, use the following configuration:

spec:
feSpec:
logNotStore: true