June 25, 2026tutorials10 min read

Lexicon: 30 Distributed-Database Terms (2026)

Distributed-database research papers are treasure maps drawn in jargon. Without a shared vocabulary, the same term can mean different things to different systems—CAP theorem’s “consistency” is not the same in Dynamo’s eventually consistent world as it is in Spanner’s linearizability. A lexicon is the Rosetta Stone that lets you decode trade-offs in sharding strategies, consensus algorithms, and storage layouts without having to reverse-engineer every paper from first principles. It also prevents the “I thought we agreed on ACID but you shipped BASE” surprises that keep on-call engineers awake at 3 a.m.

In 2026, the velocity of distributed-database innovation shows no sign of slowing. New vector stores power AI applications that must index billions of embeddings in milliseconds, while RAG pipelines demand exactly-once semantics across heterogeneous storage engines. Whether you are reading a decade-old classic like Google Bigtable or a fresh preprint on hybrid logical clocks, knowing the precise meaning of “consensus,” “shard,” or “LSM tree” saves you days of context-building. Think of this lexicon as your pocket field guide: when the paper says “read repair,” you’ll know it’s not plumbing.

How to use this lexicon

Treat the entries as quick-reference flash cards. Each term is numbered and grouped by theme so you can jump straight to the section that matches your current reading. Definitions are concise (≈45 words) and include internal links to canonical papers so you can dive deeper in one click. Bookmark the page and come back whenever a paper throws you a term you haven’t seen since last quarter’s on-call war-room.

Foundations

1. ACID

ACID stands for Atomicity, Consistency, Isolation, Durability—the four properties that guarantee correct transactions even under failure. Atomicity ensures all operations in a transaction succeed or fail as a unit; Consistency preserves database invariants; Isolation prevents interference between concurrent transactions; Durability guarantees that once committed, changes survive power loss. Systems like PostgreSQL implement ACID via MVCC and write-ahead logging.

2. BASE

BASE (Basically Available, Soft state, Eventual consistency) is the philosophical opposite of ACID, embraced by distributed systems such as Amazon Dynamo. It prioritizes availability and partition tolerance over immediate consistency, promising that if no new updates are made, eventually all replicas will converge to the same state. BASE trades strong guarantees for scalability and low latency in web-scale workloads. The formal contrast between BASE and ACID is analyzed in the BASE vs. ACID paper, which explains why web-scale systems choose soft-state availability over strict consistency guarantees.

3. Sharding

Sharding splits a dataset into smaller, horizontal partitions called shards, each managed by a separate node or replica set. Queries are routed to the correct shard via a shard key, reducing contention and enabling horizontal scaling. Systems like MongoDB and Cassandra let you choose keys that balance load while avoiding hotspots that can undermine the benefits of sharding.

4. Replication

Replication copies data across multiple nodes so that if one fails, others can serve reads and recover the lost data. There are three main flavors: single-leader (leader–follower), multi-leader, and leaderless (Dynamo-style). Replication improves availability and durability but introduces complexity around consistency, conflict resolution, and anti-entropy.

5. Quorum

A quorum is the minimum number of replicas that must acknowledge a read or write for the operation to succeed. In a system with N replicas, a typical quorum is N/2 + 1. Quorums help balance availability and consistency: larger quorums increase consistency at the cost of higher latency or lower availability during partitions. Dynamo and Cassandra let you tune quorum sizes via “R” and “W” parameters.

6. Tunable consistency

Tunable consistency allows clients to choose between stronger or weaker guarantees at request time. For example, Cassandra offers ONE, QUORUM, and ALL consistency levels for reads and writes. Stronger levels cost more latency; weaker levels improve performance but may return stale data. The technique reconciles the CAP theorem by letting operators trade consistency for latency on a per-query basis.

Knowledge graph of thirty interconnected concepts

## Consistency models

7. Eventual consistency

Eventual consistency guarantees that if no new updates are made to a data item, eventually all accesses will return the last updated value. It relaxes the timing of when replicas converge, which improves availability and partition tolerance. Dynamo and many NoSQL stores implement eventual consistency via vector clocks and gossip-based anti-entropy.

8. Strong consistency

Strong consistency demands that every read returns the most recent write or an error. It implies linearizability: once a write completes, all subsequent reads see that write, regardless of node or network latency. Systems like Spanner and Calvin achieve strong consistency through distributed consensus or deterministic scheduling, at the cost of higher latency and lower throughput during partitions.

9. Linearizability

Linearizability is a consistency model that makes a shared object behave like a single, indivisible one: once a write completes, all subsequent reads see that write, and operations appear to take effect instantaneously at some point between invocation and response. It is stronger than sequential consistency because it respects real-time ordering. Papers such as “Linearizability: A Correctness Condition for Concurrent Objects” formalize the concept.

10. Serializability

Serializability ensures that the outcome of concurrently executing transactions is equivalent to some serial execution of those transactions. It prevents anomalies like dirty reads and non-repeatable reads. Many RDBMSs implement serializability through locking or MVCC, while distributed systems often relax it to snapshot isolation or eventual consistency to improve scalability.

11. Snapshot isolation

Snapshot isolation guarantees that each transaction sees a consistent snapshot of the database as of the time it started, eliminating read–write conflicts. Writers do not block readers, and readers never see partial or uncommitted changes, but write–write conflicts are detected at commit time. PostgreSQL and Oracle use snapshot isolation via MVCC to balance performance and consistency.

12. Causal consistency

Causal consistency preserves the “happens-before” relationships of operations: if operation A causally precedes operation B, all replicas that see B must also see A. It is weaker than linearizability but stronger than eventual consistency, allowing concurrent updates without ordering constraints. Systems like COPS and Orbe implement causal consistency via vector clocks or version vectors.

Consensus and time

Reading these definitions alongside the original Paxos Made Simple paper makes the trade-offs concrete; for hands-on examples in 2026 stacks, the open-source LLM installation guide for Ollama, Mistral and Llama shows where vector storage meets consensus replicas.

13. Paxos

Paxos is a family of protocols for solving consensus in an asynchronous network with unreliable processes. It elects a leader and ensures that a single value is chosen even when messages are delayed or lost. The classic paper “Paxos Made Simple” breaks the protocol into roles—proposers, acceptors, learners—and shows how a majority quorum can agree on a single decision.

14. Raft

Raft is a consensus algorithm designed to be understandable and practical. It decomposes consensus into leader election, log replication, and safety rules that guarantee a strong leader and a totally ordered log. Raft’s use of randomized election timeouts and explicit terms reduces split-brain scenarios and makes it easier to reason about than Paxos. etcd and Consul are built on Raft.

15. Vector clock

A vector clock is a mechanism for capturing partial ordering of events across multiple processes. Each process maintains a vector whose i-th entry is the number of events it knows about from process i. By comparing vectors, processes can determine causality: if V(a) ≤ V(b), then a happened before b. Vector clocks are used in Dynamo-style stores to resolve conflicts during eventual consistency.

16. Hybrid logical clock

A hybrid logical clock combines physical timestamps with logical counters to track both real time and causal order. It assigns globally unique timestamps that are monotonic in both real time and causality, reducing the need for explicit synchronization and improving conflict resolution. Hybrid logical clocks are used in systems like CockroachDB to merge real-time and causal ordering.

17. Byzantine fault tolerance

Byzantine fault tolerance (BFT) enables a system to reach consensus even when some nodes exhibit arbitrary, potentially malicious behavior. The “Byzantine Generals Problem” formalizes this challenge: loyal generals must agree on a common plan despite traitors spreading conflicting information. Practical BFT systems like PBFT and HotStuff require a super-majority of honest nodes and cryptographic signatures to guarantee safety.

18. Leader election

Leader election is the process by which a distributed system selects a single node to coordinate writes, serve as the sequencer, or act as the primary replica. Algorithms like Raft and Paxos embed leader election via randomized timeouts and heartbeat protocols. A stable leader reduces log divergence and simplifies conflict resolution, but leader failures can trigger costly elections that temporarily degrade performance.

Five horizontal conceptual layers from foundations to AI era

## Storage engines

19. B-tree

A B-tree is a self-balancing tree data structure that keeps sorted data and allows searches, sequential access, insertions, and deletions in O(log n) time. B-trees are the workhorse of relational databases because they minimize disk I/O by storing many keys per node and maintaining high fan-out. PostgreSQL and MySQL use B-trees for primary indexes and secondary indexes alike.

20. LSM tree

A log-structured merge tree (LSM tree) buffers writes in memory (C0) and periodically flushes them to immutable on-disk SSTables (C1, C2, …). LSM trees optimize for high write throughput by transforming random writes into sequential ones and by deferring compaction to background threads. RocksDB, Cassandra, and Google Bigtable rely on LSM trees to scale writes without sacrificing read performance.

21. MVCC

Multi-version concurrency control (MVCC) maintains multiple versions of each row so that readers never block writers and writers never block readers. Each transaction sees a snapshot consistent with its start time, eliminating read–write conflicts. PostgreSQL, MySQL (InnoDB), and FoundationDB implement MVCC to provide snapshot isolation and reduce locking overhead.

22. Write-ahead log

A write-ahead log (WAL) records every change to a database before it is applied to the main data structures. In the event of a crash, the log is replayed to restore the database to a consistent state. WALs are the backbone of durability in systems like PostgreSQL, RocksDB, and Cassandra, and they also serve as the replication stream for followers.

23. Bloom filter

A Bloom filter is a space-efficient probabilistic data structure that tests whether an element is a member of a set. It can return false positives but never false negatives. Storage engines such as Apache Cassandra use Bloom filters to quickly skip SSTables that cannot contain a requested key, reducing disk I/O during lookups.

24. Anti-entropy (Merkle tree)

Anti-entropy protocols repair inconsistencies between replicas by exchanging and merging differences. A Merkle tree is a hash tree that summarizes the contents of a replica; replicas exchange Merkle hashes to detect mismatches and then transfer only the differing data. Dynamo and Cassandra use Merkle-tree anti-entropy to maintain eventual consistency across large clusters.

Modern and AI era

25. NewSQL

NewSQL refers to distributed, relational database systems that scale horizontally while preserving ACID semantics. Unlike traditional sharded SQL databases, NewSQL systems like Google Spanner, CockroachDB, and YugabyteDB use consensus protocols and synchronous replication to provide strong consistency and SQL interfaces without sacrificing performance. They target OLTP workloads that once required NoSQL trade-offs.

26. Vector database

A vector database stores and indexes high-dimensional vectors (embeddings) so that similarity search can be executed efficiently. It supports operations like k-nearest neighbors (k-NN) and range queries over embeddings generated by deep learning models. Vector databases such as Milvus, Pinecone, and Weaviate power modern AI applications that need semantic search over text, images, or audio.

27. HNSW index

Hierarchical navigable small world (HNSW) is a graph-based indexing technique for approximate nearest neighbor search in high-dimensional spaces. It builds a multi-layer graph where each layer is a small-world network; higher layers act as “express lanes” that guide search quickly toward the target region. HNSW indexes are used in vector databases like Milvus and Vespa to achieve sub-millisecond query latency at billion-scale.

28. RAG

Retrieval-augmented generation (RAG) augments large language models (LLMs) with real-time, factual context retrieved from a vector database or document store. Instead of relying solely on pre-trained knowledge, RAG systems query external knowledge bases to ground responses, reducing hallucinations and improving accuracy. Production pipelines integrate RAG with embeddings, vector databases, and prompt orchestration.

29. pgvector

pgvector is an open-source PostgreSQL extension that adds vector search capabilities to the relational database. It stores embeddings as PostgreSQL arrays and implements indexing strategies like HNSW and IVFFlat directly inside the database. With pgvector, teams can unify transactional and vector workloads on a single system, simplifying data pipelines and reducing operational overhead.

30. Embedding store

An embedding store is a specialized repository for storing, indexing, and retrieving high-dimensional vector embeddings produced by machine learning models. It supports similarity search, filtering, and metadata lookup, and often integrates with vector databases and RAG pipelines. Embedding stores abstract away the complexity of vector search so that applications can focus on semantics rather than indexing algorithms.

How these terms connect

Together these terms form a lattice of trade-offs: ACID and BASE frame the consistency–availability spectrum; consensus algorithms like Paxos and Raft move us from eventual to strong consistency; storage engines like LSM trees and B-trees translate those guarantees into disk layouts; and modern vector databases bridge databases with AI workloads. Whether you’re tuning quorums in Cassandra, evaluating a NewSQL system, or building a RAG pipeline with pgvector, the same vocabulary helps you navigate latency, correctness, and cost.

See also: top 12 NoSQL papers

See also: vector databases and RAG