Performance and Tuning

The adaptive optimizer and Pipeline execution engine in Apache Doris let most workloads run well out of the box, but production environments usually still need systematic performance tuning. Start with the tuning methodology to locate bottlenecks, then optimize for specific scenarios on the query or ingestion side. When an execution plan is hard to explain, consult the optimization technology principles to understand the mechanisms behind it.

Tuning Methodology

Performance Tuning Overview

Learn the end-to-end tuning workflow, the diagnostic and analysis tools that Doris provides, and how to systematically locate performance issues.

Query Performance

Schema and Index Optimization

Set the upper bound of query performance through the table model, partitioning and bucketing, key column design, and index choices (prefix index, BloomFilter, NGram BloomFilter, and inverted index).

Materialized View

Pre-compute results with synchronous or asynchronous materialized views, and let transparent rewriting accelerate existing SQL without modifications.

Join Optimization

Eliminate Shuffle with Colocation Join, and use the Distribute and Leading hints to adjust the Join plan manually when the optimizer's decision is suboptimal.

Caching Acceleration

Combine SQL Cache, Condition Cache, and the external table file cache to reuse query results, filter results, and remote data across repeated queries.

Execution Tuning

Based on the runtime bottlenecks shown in the Profile, adjust parallelism and RuntimeFilter wait time, address data skew, and tune CBO rules.

High Concurrency and Point Queries

Use row store and short-circuit execution to support high-QPS primary key queries on Unique Key tables, and use dictionary tables instead of dimension table joins to accelerate KV queries.

Distinct Counts

Use BITMAP for exact deduplication, or HLL for approximate UV computation when a 1%-2% error is acceptable, to significantly reduce memory and storage overhead.
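
The trade-off between exact and approximate distinct counts can be illustrated with a minimal Python sketch. It is a textbook HyperLogLog, not the Doris HLL implementation: the `HyperLogLog` class, the `_hash64` helper, and the register count (`p=12`) are assumptions chosen for illustration, and a Python `set` stands in for an exact structure such as a bitmap.

```python
import hashlib
import math


def _hash64(value):
    """Hypothetical helper: derive a stable 64-bit hash from any value."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HyperLogLog:
    """Textbook HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value):
        h = _hash64(value)
        width = 64 - self.p
        idx = h >> width                      # top p bits select a register
        rest = h & ((1 << width) - 1)         # remaining bits
        rank = width - rest.bit_length() + 1  # position of the leftmost 1-bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# Each of 100,000 user ids appears twice in the input stream.
user_ids = [i % 100_000 for i in range(200_000)]

exact = len(set(user_ids))          # exact count, memory grows with cardinality
hll = HyperLogLog()
for uid in user_ids:
    hll.add(uid)
approx = hll.estimate()             # approximate count, fixed number of registers
relative_error = abs(approx - exact) / exact
```

The sketch keeps a fixed 4096 registers regardless of how many distinct values it sees, which is where the memory saving of the approximate approach comes from; the exact set grows with every new value.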

Query Profile

Analyze the time distribution of operators such as Scan, Exchange, Join, and aggregation through the Profile to quickly locate the slowest stage of a query.

Ingestion Performance

DML Tuning

For the execution plans of INSERT, UPDATE, and DELETE, adjust write parallelism, batch size, and execution path to achieve stable ingestion performance.

Optimization Technology Principles

Query Optimizer

Introduces rule rewriting and the cost model of the Nereids optimizer, including rule-based equivalent transformations and cost-based Join Enumeration.

Pipeline Execution Engine

Introduces the Pipeline model, scheduling mechanism, and Morsel-Driven parallel execution, which form the foundation of the Doris execution layer.

Runtime Filter

Introduces the execution mechanism by which RuntimeFilter is generated on the Join build side and pushed down to the probe side to skip irrelevant data.
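
As a rough illustration of this mechanism, the Python sketch below builds a filter from the join keys on the build side and applies it while scanning the probe side, before the join runs. The function name `hash_join_with_runtime_filter` and the row layout are hypothetical, and a plain set of keys stands in for the filter structures a real engine would use.

```python
def hash_join_with_runtime_filter(build_rows, probe_rows, key):
    """Join two lists of dicts on `key`, filtering the probe side early.

    Returns the joined rows and the number of probe rows skipped by the filter.
    """
    # Build phase: hash table over the (usually smaller) build side.
    hash_table = {}
    for row in build_rows:
        hash_table.setdefault(row[key], []).append(row)

    # The runtime filter is derived from the build-side keys.
    runtime_filter = set(hash_table)

    # "Pushed down" to the probe-side scan: rows that cannot match are dropped
    # before they reach the join operator.
    scanned = [row for row in probe_rows if row[key] in runtime_filter]
    skipped = len(probe_rows) - len(scanned)

    # Probe phase: only the surviving rows are joined.
    joined = [
        {**build_row, **probe_row}
        for probe_row in scanned
        for build_row in hash_table[probe_row[key]]
    ]
    return joined, skipped


dims = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
facts = [{"id": i % 10, "v": i} for i in range(100)]
joined, skipped = hash_join_with_runtime_filter(dims, facts, "id")
```

In this toy data set only two of the ten key values exist on the build side, so most probe rows are discarded at scan time rather than inside the join.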

TopN Optimization

Introduces short-circuit execution and partial sorting optimization in ORDER BY ... LIMIT scenarios, which skip most data to quickly return the TopN.
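
The idea of partial sorting can be sketched in Python with a bounded heap: only the current best N values are kept, and any row that cannot enter the TopN is discarded without being sorted. This is a generic illustration under that assumption, not the Doris implementation; `top_n_desc` is a hypothetical name.

```python
import heapq


def top_n_desc(values, n):
    """Return the n largest values in descending order, plus the skipped count."""
    if n <= 0:
        raise ValueError("n must be positive")
    heap = []      # min-heap holding the n largest values seen so far
    skipped = 0
    for value in values:
        if len(heap) < n:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the current threshold and replaces it.
            heapq.heapreplace(heap, value)
        else:
            # Below the threshold: the row can never be in the TopN.
            skipped += 1
    return sorted(heap, reverse=True), skipped


# A deterministic permutation of 0..999 (37 and 1000 are coprime).
values = [(i * 37) % 1000 for i in range(1000)]
top3, skipped = top_n_desc(values, 3)
```

The smallest value in the heap acts as a running threshold, which is what allows most rows to be rejected with a single comparison instead of taking part in a full sort.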

Statistics

Introduces the collection and use of table-level and column-level statistics, including histograms, which the cost-based query optimizer relies on to choose plans.

Benchmarks

Star Schema Benchmark (SSB)

SSB test results of Doris on standard hardware, including reproducible environment configuration and tuning highlights.

TPC-H

TPC-H test results of Doris, covering data ingestion, query latency, and comparison guidelines.

TPC-DS

TPC-DS test results of Doris in more complex multi-table analytical scenarios.