Everything you need to know about Apache Doris
A reference index for the core technologies and key features of Apache Doris
- Analytic Functions
- Batch Load
- Binlog / Table Stream
- BM25 Relevance Scoring: Apache Doris ranks full-text search results by BM25 relevance, so SQL queries can sort matches the way a search engine would.
- Catalog Integrations
- Columnar Storage: Doris's on-disk segment format. Per-column encoding and per-page compression shrink the table; per-page indexes let queries skip pages they do not need to read.
- Compute Group
- Condition Cache
- Data Cache & Page Cache
- Data Compaction
- Data Lineage: Column-level lineage extracted from the Nereids plan and shipped to your governance tool through a pluggable SPI, plus table-level traces in the audit log.
- Data Model
- Data Pruning
- Data Update and Delete
- Embedding
- Full-text Search
- Group Commit: Merges many small INSERTs and Stream Loads server-side into one transaction, with sync and async modes for the latency versus throughput trade-off.
- High-Concurrency Point Query
- Hybrid Search
- Iceberg
- Incremental Materialized View
- Inverted Index
- Kafka and CDC Integration
- LLM SQL Functions: AI_* SQL functions in Doris send column values to an LLM and return the result inline, so classification, extraction, and summarization stay in SQL.
- Load Transactions
- Managing Lake Tables: Doris writes and manages Iceberg, Hive, and Paimon tables through SQL, covering CREATE, INSERT, UPDATE, DELETE, schema and partition evolution, branches, and snapshots.
- MCP Server
- Metadata Cache
- MPP Architecture
- Multi Catalog
- Parquet Reader Optimization: A native C++ vectorized Parquet reader that prunes row groups and pages, decodes dictionaries directly, and reads payloads only after filters have been applied.
- Partitioning and Bucketing
- Pipeline Execution Engine
- Pluggable Authentication and Authorization
- Preaggregation and Rollup
- Prepared Statement
- Query Cache: A pipeline-level cache that stores intermediate aggregation results at tablet granularity, so queries that share tablets reuse work.
- Reciprocal Rank Fusion
- Resource Group
- Spill to Disk
- Stream Load
- Unique Key
- VARIANT Data Type
- Vector Index: A native ANN index on ARRAY<FLOAT> columns. Build it inside a Doris table and run millisecond TopN vector search next to your SQL analytics.
- Vectorized Execution
- Vertical Compaction: Column-group-based compaction that keeps memory bounded when Doris merges rowsets on wide tables.
- Workload Group