Upgrading Apache Doris cluster deployed by Doris Operator
This document describes how to use updates to upgrade an Apache Doris cluster based on a Doris Operator deployment.
Similar to conventionally deployed cluster upgrades, Doris clusters deployed by Doris Operator still require rolling upgrades from BE to FE nodes. Doris Operator is based on Kubernetes' Performing a Rolling Update provides rolling upgrade capabilities.
Things to note before upgrading
- It is recommended that the upgrade operation be performed during off-peak periods.
- During the rolling upgrade process, the connection to the closed node will fail, causing the request to fail. For this type of business, it is recommended to add retry capabilities to the client.
- Before upgrading, you can read the General Upgrade Manual to help you understand some principles and precautions during the upgrade. .
- The compatibility of data and metadata cannot be verified before upgrading. Therefore, cluster upgrade must avoid single copy of data and single FE FOLLOWER node in the cluster.
- Nodes will be restarted during the upgrade process, so unnecessary cluster balancing and replica repair logic may be triggered. Please shut it down first with the following command.
admin set frontend config("disable_balance" = "true");
admin set frontend config("disable_colocate_balance" = "true");
admin set frontend config("disable_tablet_scheduler" = "true");
- When upgrading Doris, please follow the principle of not upgrading across two or more key node versions. If you want to upgrade across multiple key node versions, upgrade to the latest key node version first, and then upgrade in sequence. If it is a non-key node version, You can ignore skipping. For details, please refer to Upgrade Version Instructions
Upgrade operation
The order of node types in the upgrade process is as follows. If a certain type of node does not exist, it will be skipped:
cn/be -> fe -> broker
It is recommended to modify the image
of the corresponding cluster components in sequence and then apply the configuration. After the current type of component is fully upgraded and the status returns to normal, the rolling upgrade of the next type of node can be performed.
Upgrade BE
If you retain the cluster's crd (Doris Operator defines the abbreviation of DorisCluster
type resource name) file, you can upgrade by modifying the configuration file and running the kubectl apply
command.
Modify
spec.beSpec.image
Change
selectdb/doris.be-ubuntu:2.0.4
toselectdb/doris.be-ubuntu:2.1.0
$ vim doriscluster-sample.yaml
- Save the changes and apply the changes to be upgraded:
$ kubectl apply -f doriscluster-sample.yaml -n doris
It can also be modified directly through kubectl edit dcr
.
- Check the dcr list under namespace 'doris' to obtain the
cluster_name
that needs to be updated.
$ kubectl get dcr -n doris
NAME FESTATUS BESTATUS CNSTATUS
doriscluster-sample available available
- Modify, save and take effect
$ kubectl edit dcr doriscluster-sample -n doris
After entering the text editor, you will find `spec.beSpec.image` and change `selectdb/doris.be-ubuntu:2.0.4` to `selectdb/doris.be-ubuntu:2.1.0`
- View the upgrade process and results:
$ kubectl get pod -n doris
When all Pods are rebuilt and enter the Running state, the upgrade is complete.
Upgrade FE
If you retain the cluster's crd (Doris Operator defines the abbreviation of the DorisCluster
type resource name) file, you can upgrade by modifying the configuration file and running the kubectl apply
command.
Modify
spec.feSpec.image
Change
selectdb/doris.fe-ubuntu:2.0.4
toselectdb/doris.fe-ubuntu:2.1.0
$ vim doriscluster-sample.yaml
- Save the changes and apply the changes to be upgraded:
$ kubectl apply -f doriscluster-sample.yaml -n doris
It can also be modified directly through kubectl edit dcr
.
- Modify, save and take effect
$ kubectl edit dcr doriscluster-sample -n doris
After entering the text editor, you will find `spec.feSpec.image` and change `selectdb/doris.fe-ubuntu:2.0.4` to `selectdb/doris.fe-ubuntu:2.1.0`
- View the upgrade process and results:
$ kubectl get pod -n doris
When all Pods are rebuilt and enter the Running state, the upgrade is complete.
After the upgrade is completed
Verify cluster node status
Access Doris through mysql-client
through the method provided in the Access Doris Cluster document.
Use SQL such as show frontends
and show backends
to view the version and status of each component.
mysql> show frontends\G;
*************************** 1. row ***************************
Name: fe_13c132aa_3281_4f4f_97e8_655d01287425
Host: doriscluster-sample-fe-0.doriscluster-sample-fe-internal.doris.svc.cluster.local
EditLogPort: 9010
HttpPort: 8030
QueryPort: 9030
RpcPort: 9020
ArrowFlightSqlPort: -1
Role: FOLLOWER
IsMaster: false
ClusterId: 1779160761
Join: true
Alive: true
ReplayedJournalId: 2422
LastStartTime: 2024-02-19 06:38:47
LastHeartbeat: 2024-02-19 09:31:33
IsHelper: true
ErrMsg:
Version: doris-2.1.0
CurrentConnected: Yes
*************************** 2. row ***************************
Name: fe_f1a9d008_d110_4780_8e60_13d392faa54e
Host: doriscluster-sample-fe-2.doriscluster-sample-fe-internal.doris.svc.cluster.local
EditLogPort: 9010
HttpPort: 8030
QueryPort: 9030
RpcPort: 9020
ArrowFlightSqlPort: -1
Role: FOLLOWER
IsMaster: true
ClusterId: 1779160761
Join: true
Alive: true
ReplayedJournalId: 2423
LastStartTime: 2024-02-19 06:37:35
LastHeartbeat: 2024-02-19 09:31:33
IsHelper: true
ErrMsg:
Version: doris-2.1.0
CurrentConnected: No
*************************** 3. row ***************************
Name: fe_e42bf9da_006f_4302_b861_770d2c955a47
Host: doriscluster-sample-fe-1.doriscluster-sample-fe-internal.doris.svc.cluster.local
EditLogPort: 9010
HttpPort: 8030
QueryPort: 9030
RpcPort: 9020
ArrowFlightSqlPort: -1
Role: FOLLOWER
IsMaster: false
ClusterId: 1779160761
Join: true
Alive: true
ReplayedJournalId: 2422
LastStartTime: 2024-02-19 06:38:17
LastHeartbeat: 2024-02-19 09:31:33
IsHelper: true
ErrMsg:
Version: doris-2.1.0
CurrentConnected: No
3 rows in set (0.02 sec)
If the Alive
status of the FE node is true and the Version
value is the new version, the FE node is upgraded successfully.
mysql> show backends\G;
*************************** 1. row ***************************
BackendId: 10002
Host: doriscluster-sample-be-0.doriscluster-sample-be-internal.doris.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
ArrowFlightSqlPort: -1
LastStartTime: 2024-02-19 06:37:56
LastHeartbeat: 2024-02-19 09:32:43
Alive: true
SystemDecommissioned: false
TabletNum: 14
DataUsedCapacity: 0.000
TrashUsedCapcacity: 0.000
AvailCapacity: 12.719 GB
TotalCapacity: 295.167 GB
UsedPct: 95.69 %
MaxDiskUsedPct: 95.69 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: doris-2.1.0
Status: {"lastSuccessReportTabletsTime":"2024-02-19 09:31:48","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
*************************** 2. row ***************************
BackendId: 10003
Host: doriscluster-sample-be-1.doriscluster-sample-be-internal.doris.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
ArrowFlightSqlPort: -1
LastStartTime: 2024-02-19 06:37:35
LastHeartbeat: 2024-02-19 09:32:43
Alive: true
SystemDecommissioned: false
TabletNum: 8
DataUsedCapacity: 0.000
TrashUsedCapcacity: 0.000
AvailCapacity: 12.719 GB
TotalCapacity: 295.167 GB
UsedPct: 95.69 %
MaxDiskUsedPct: 95.69 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: doris-2.1.0
Status: {"lastSuccessReportTabletsTime":"2024-02-19 09:31:43","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
*************************** 3. row ***************************
BackendId: 11024
Host: doriscluster-sample-be-2.doriscluster-sample-be-internal.doris.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
ArrowFlightSqlPort: -1
LastStartTime: 2024-02-19 08:50:36
LastHeartbeat: 2024-02-19 09:32:43
Alive: true
SystemDecommissioned: false
TabletNum: 0
DataUsedCapacity: 0.000
TrashUsedCapcacity: 0.000
AvailCapacity: 12.719 GB
TotalCapacity: 295.167 GB
UsedPct: 95.69 %
MaxDiskUsedPct: 95.69 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: doris-2.1.0
Status: {"lastSuccessReportTabletsTime":"2024-02-19 09:32:04","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
3 rows in set (0.01 sec)
If the Alive
status of the BE node is true and the Version
value is the new version, the BE node is upgraded successfully.
Restore cluster replica synchronization and balancing
After confirming that the status of each node is correct, execute the following SQL to restore cluster balancing and replica repair:
admin set frontend config("disable_balance" = "false");
admin set frontend config("disable_colocate_balance" = "false");
admin set frontend config("disable_tablet_scheduler" = "false");