Everything you need to know
about Apache Doris

A reference index for the core technologies and key features of Apache Doris

47 / 47 features

Analytic Functions
query-acceleration · performance

Batch Load
data-loading · bulk-ingestion

Binlog / Table Stream
cdc · data-streaming

BM25 Relevance Scoring
Apache Doris ranks full-text search results by BM25 relevance, so SQL queries can sort matches the way a search engine would.
search · ranking

Catalog Integrations
lakehouse · iceberg

Columnar Storage
Doris's on-disk segment format. Per-column encoding and per-page compression shrink the table; per-page indexes let queries skip unread pages.
storage · performance

Compute Group
resource-management · storage-compute-decoupled

Condition Cache
query-acceleration · caching

Data Cache & Page Cache
lakehouse · caching

Data Compaction
storage · performance

Data Lineage
Column-level lineage extracted from the Nereids plan and shipped to your governance tool through a pluggable SPI, plus table-level traces in the audit log.
governance · observability

Data Model
storage · table-design

Data Pruning
query-acceleration · performance

Data Update and Delete
update · delete

Embedding
ai · search

Full-text Search
search · sql

Group Commit
Merge many small INSERTs and Stream Loads server-side into one transaction, with sync and async modes for the latency vs throughput tradeoff.
load · ingest

High-Concurrency Point Query
query-acceleration · performance

Hybrid Search
search · ai

Iceberg
lakehouse · iceberg

Incremental Materialized View
materialized-view · performance

Inverted Index
query-acceleration · search

Kafka and CDC Integration
load · streaming

LLM SQL Functions
AI_* SQL functions in Doris send column values to an LLM and return the result inline, so classification, extraction, and summarization stay in SQL.
ai · analytics

Load Transactions
load · streaming

Managing Lake Tables
Doris writes and manages Iceberg, Hive, and Paimon tables through SQL. CREATE, INSERT, UPDATE, DELETE, schema and partition evolution, branches, snapshots.
lakehouse · iceberg

MCP Server
ai · integration

Metadata Cache
lakehouse · caching

MPP Architecture
query-acceleration · performance

Multi Catalog
lakehouse · federation

Parquet Reader Optimization
A native C++ vectorized Parquet reader that prunes row groups and pages, decodes dictionaries directly, and reads payloads only after filters.
lakehouse · query-acceleration

Partitioning and Bucketing
table-design · performance

Pipeline Execution Engine
query-acceleration · performance

Pluggable Authentication and Authorization
security · authentication

Preaggregation and Rollup
storage · performance

Prepared Statement
query-acceleration · performance

Query Cache
A pipeline-level cache that stores intermediate aggregation results at tablet granularity, so queries that share tablets reuse work.
query-acceleration · caching

Reciprocal Rank Fusion
search · ai

Resource Group
resource-management · multi-tenancy

Spill to Disk
query-engine · memory-management

Stream Load
data-loading · real-time-ingest

Unique Key
table-design · upsert

VARIANT Data Type
data-types · semi-structured

Vector Index
A native ANN index on ARRAY<FLOAT> columns. Build it inside a Doris table and run millisecond TopN vector search next to your SQL analytics.
search · ai

Vectorized Execution
query-acceleration · performance

Vertical Compaction
Column-group-based compaction that keeps memory bounded when Doris merges rowsets on wide tables.
storage · performance

Workload Group
resource-management · multi-tenancy
