Understanding Vitess Keyspace Partitioning Models

A Vitess keyspace is a logical database that may be spread across many physical shards, and the partitioning model is the function that decides which shard any given row lives on. Get that function wrong and every design decision downstream compounds it: hot-path reads scatter across the whole fleet, one shard absorbs a disproportionate share of writes, and a future resharding turns into a bespoke data migration instead of a mechanical copy. This page resolves the concrete decision that sits under all of that — which partitioning model (range, hash, lookup, or custom) a keyspace should use, how each one maps a sharding key to a physical shard, and how to declare and verify the choice in a VSchema. It is written for database platform engineers, MySQL SREs, and Python orchestration builders who own the shard map and the routing behaviour that flows from it. This is the upstream decision assumed settled by Designing Horizontal Shard Topologies; here we make it deliberately.

Prerequisites

Before committing to a partitioning model, confirm the following:

Vitess 18.0 or later for the vtctldclient ApplyVSchema surface and native Reshard v2 workflows used in the examples below. Earlier releases expose the same concepts through vtctlclient with different syntax.
A running topology server — etcd (v3) or Consul, reachable from every cell — holding the keyspace, shard, and VSchema records that the stateless VTGate routing layer reads to plan queries.
A VSchema you can edit and apply. The partitioning model is a VSchema construct; if the JSON structure of vindexes and column_vindexes is unfamiliar, work through Mastering VSchema Syntax and Structure first.
A candidate sharding key — the column whose value drives placement. It must have high cardinality and, for most models, uniform entropy; the read patterns it needs to keep single-shard should already be enumerated.
A shard-count target grounded in write QPS, dataset size, and per-shard IOPS headroom. The partitioning model and the shard count are chosen together — the arithmetic is worked through in How to Calculate Optimal Shard Count for MySQL.

Core Mechanism: How a Partitioning Model Resolves a Row to a Shard

Vitess does not store a “shard” column on any row. Placement is computed at query time, and every partitioning model is really a choice of one function — the primary vindex — applied to the sharding-key column.

The keyspace ID is the universal intermediate. Whatever the model, VTGate first turns the sharding-key value into an 8-byte keyspace ID by running it through the table’s primary vindex. Shards are not named buckets; each owns a contiguous key-range interval over that 8-byte space, written as a hexadecimal prefix pair — -80 means keyspace IDs from 0x00… up to (not including) 0x80…, and 80- means 0x80… to the maximum. A row belongs to whichever shard’s interval contains its keyspace ID. So the partitioning model never picks a shard directly; it only decides how the keyspace ID is derived, and the key ranges do the rest. This is why the same physical shard layout can serve any model — swapping the model swaps the vindex, not the shards.
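The interval rule above can be sketched in a few lines of Python. This is a minimal illustration, not VTGate's implementation: `parse_shard` and `shard_for` are hypothetical helper names, and the sketch assumes shard names are the hexadecimal prefix pairs described above, compared as raw bytes.

```python
def parse_shard(name: str) -> tuple[bytes, bytes]:
    """Split a shard name such as '40-80' into (start, end) byte prefixes.

    An empty start means the minimum keyspace ID; an empty end means no upper bound.
    """
    start_hex, end_hex = name.split("-")
    return bytes.fromhex(start_hex), bytes.fromhex(end_hex)


def shard_for(keyspace_id: bytes, shards: list[str]) -> str:
    """Return the shard whose half-open interval [start, end) contains keyspace_id."""
    for name in shards:
        start, end = parse_shard(name)
        # Python compares bytes lexicographically, which matches prefix-range semantics.
        if keyspace_id >= start and (end == b"" or keyspace_id < end):
            return name
    raise ValueError(f"no shard covers keyspace ID {keyspace_id.hex()}")


SHARDS = ["-80", "80-"]
print(shard_for(bytes.fromhex("7fffffffffffffff"), SHARDS))  # -80
print(shard_for(bytes.fromhex("8000000000000000"), SHARDS))  # 80-
```

Nothing in the lookup depends on how the keyspace ID was produced, which is the property that lets one shard layout serve any model.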

The primary vindex is the model. Each model corresponds to a class of primary vindex:

Hash partitioning uses a hash (legacy null-key DES) or xxhash vindex. It scrambles the input into a uniformly distributed keyspace ID, so sequential or clustered input values land on unrelated shards. Uniform distribution is the point — it neutralises write hotspots from monotonic keys and spreads a high-ingest workload evenly. The cost is that any range predicate (WHERE created_at BETWEEN …) becomes a scatter, because adjacent values are deliberately scattered.
Range partitioning uses an order-preserving vindex such as numeric or binary, where the keyspace ID tracks the raw value’s ordering. Rows with nearby sharding-key values sit on the same or adjacent shards, so a bounded range scan touches one or two shards instead of all of them. It suits time-series, monotonically increasing identifiers, and archival tiering — at the cost of needing proactive shard splitting, because new writes concentrate on whichever shard owns the current high end of the range.
Lookup partitioning uses a lookup/consistent_lookup vindex backed by a MySQL mapping table that stores, for each sharding-key value, the keyspace ID it resolves to. It decouples placement from the value’s own entropy, which is what lets you route on a secondary column, co-locate rows that no single natural key would, or keep a foreign-key relationship on one shard. The tradeoff is an extra table to keep transactionally consistent with the data it maps — the full lifecycle is covered in Configuring Lookup Vindexes for Cross-Shard Joins.
Custom partitioning binds a bespoke vindex plugin (or a composite of existing ones) declared entirely in the VSchema, so platform teams can inject application-specific placement logic — a region prefix, a tenant-aware split, a multi-column function — without altering any MySQL table structure. It is the escape hatch when none of the built-in functions match the access pattern.
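To make the hash versus range contrast concrete, the sketch below places 1,000 sequential IDs on four shards under each approach. It is an illustration under stated assumptions: `numeric_like` encodes the integer as 8 big-endian bytes to stand in for an order-preserving vindex, and `hash_like` uses BLAKE2b from the Python standard library purely as a stand-in scrambler, not as the xxhash vindex itself. All function names are hypothetical.

```python
import hashlib
from collections import Counter

# Four equal shards, each identified by the first byte at which its range starts.
SHARD_STARTS = [("-40", 0x00), ("40-80", 0x40), ("80-c0", 0x80), ("c0-", 0xC0)]


def shard_of(keyspace_id: bytes) -> str:
    """Pick the shard from the first byte, which is enough for these four ranges."""
    first = keyspace_id[0]
    owner = SHARD_STARTS[0][0]
    for name, start in SHARD_STARTS:
        if first >= start:
            owner = name
    return owner


def numeric_like(value: int) -> bytes:
    """Order-preserving stand-in: the integer as 8 big-endian bytes."""
    return value.to_bytes(8, "big")


def hash_like(value: int) -> bytes:
    """Stand-in scrambler (BLAKE2b, 8-byte digest); NOT the xxhash vindex itself."""
    return hashlib.blake2b(value.to_bytes(8, "big"), digest_size=8).digest()


ids = range(1, 1001)  # sequential, auto-increment style keys
by_range = Counter(shard_of(numeric_like(i)) for i in ids)
by_hash = Counter(shard_of(hash_like(i)) for i in ids)
print("order-preserving:", dict(by_range))
print("hashed:          ", dict(by_hash))
```

Under the order-preserving encoding every small sequential ID begins with a zero byte and lands on the first shard, which is the write hotspot described above; the scrambled IDs spread across all four.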

Powers of two make the model splittable. Because every shard boundary is a binary prefix, a shard bisects cleanly into two children that together cover exactly its old range (-80 → -40 and 40-80). Provisioning shard counts as powers of two means a future Reshard is a mechanical copy of one contiguous range, not a redistribution — and this holds regardless of which partitioning model produced the keyspace IDs. Odd counts force uneven ranges and turn every split into a hand-computed migration.
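The bisection arithmetic can be checked with a short sketch. `bisect_shard` is a hypothetical helper, not a Vitess command; it assumes shard names are hexadecimal prefix pairs and simply computes the midpoint of the range.

```python
def bisect_shard(name: str) -> tuple[str, str]:
    """Split a key range such as '-80' or '40-80' into two equal children."""
    start_hex, end_hex = name.split("-")
    # Pad both bounds to a common width, plus one extra byte so the midpoint is exact.
    width = max(len(start_hex), len(end_hex), 2) + 2
    start = int(start_hex.ljust(width, "0"), 16) if start_hex else 0
    end = int(end_hex.ljust(width, "0"), 16) if end_hex else 16 ** width
    mid_hex = format((start + end) // 2, f"0{width}x")
    # Trim trailing zero bytes so '4000' is written as '40'.
    while len(mid_hex) > 2 and mid_hex.endswith("00"):
        mid_hex = mid_hex[:-2]
    return f"{start_hex}-{mid_hex}", f"{mid_hex}-{end_hex}"


print(bisect_shard("-80"))    # ('-40', '40-80')
print(bisect_shard("80-"))    # ('80-c0', 'c0-')
print(bisect_shard("40-80"))  # ('40-60', '60-80')
```

Each pair of children covers exactly the parent's range, so the split never touches rows owned by any other shard.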

Choosing a Model

The decision is driven by the dominant access pattern, not by which model is fashionable. Match the model to how queries actually filter and how writes actually arrive.

Model	Keyspace ID derived by	Best fit	Primary risk
Hash	Hashing the sharding key (`hash`, `xxhash`)	High-ingest OLTP; monotonic or clustered keys that would otherwise hotspot; point lookups by the sharding key	Range scans on the sharding key scatter to every shard
Range	Order-preserving map of the raw value	Time-series, sequential IDs, bounded range scans, archival tiering	Writes concentrate on the top-of-range shard; needs proactive splits
Lookup	External mapping table	Routing on a secondary column; co-locating related rows; cross-shard join avoidance	Extra table to keep consistent; write amplification on inserts
Custom	VSchema-declared plugin/composite	Tenant-aware or region-aware placement no built-in vindex expresses	You own the correctness and the resharding story

Two heuristics resolve most cases. First, if writes arrive on a monotonically increasing key (an auto-increment ID, a timestamp) and you do not need range scans on it, choose hash — it is the single most reliable defence against a write hotspot. Second, if your hottest read filters on a column that is not the natural primary key, a lookup vindex on that column will keep the read single-shard where hashing the primary key would force a scatter. When the hot read and the hot write disagree on which column to route by, you generally shard by the write key and add a secondary lookup vindex for the read.

Step-by-Step: Declaring and Applying a Partitioning Model

The sequence below turns a chosen model into a live VSchema for a commerce keyspace. Each step is independently verifiable, so you can inspect state before proceeding.

1. Declare a hash-partitioned VSchema. The primary vindex is what makes a table routable — without one, VTGate scatters every query no matter how the shards are laid out. Bind the sharding-key column of each table to a shared xxhash vindex:

{
  "sharded": true,
  "vindexes": {
    "xxhash": { "type": "xxhash" }
  },
  "tables": {
    "orders": {
      "column_vindexes": [
        { "column": "customer_id", "name": "xxhash" }
      ]
    },
    "customers": {
      "column_vindexes": [
        { "column": "id", "name": "xxhash" }
      ]
    }
  }
}

Sharding orders by customer_id (not order_id) co-locates a customer’s entire order history on one shard, so the common “fetch this customer’s orders” read stays single-shard even under a hash model. Apply it:

vtctldclient ApplyVSchema --vschema-file commerce.vschema.json commerce

2. Add a secondary lookup vindex for the off-key read path. Suppose orders are also queried by order_ref (an external reference customers quote to support). Hashing customer_id scatters that lookup across every shard. A consistent_lookup_unique vindex keeps it single-shard by maintaining a mapping table from order_ref to keyspace ID:

{
  "sharded": true,
  "vindexes": {
    "xxhash": { "type": "xxhash" },
    "order_ref_lookup": {
      "type": "consistent_lookup_unique",
      "params": {
        "table": "commerce.order_ref_idx",
        "from": "order_ref",
        "to": "keyspace_id"
      },
      "owner": "orders"
    }
  },
  "tables": {
    "orders": {
      "column_vindexes": [
        { "column": "customer_id", "name": "xxhash" },
        { "column": "order_ref", "name": "order_ref_lookup" }
      ]
    }
  }
}

The first column_vindexes entry remains the primary (it computes the keyspace ID); the second is a secondary index Vitess populates and consults. Marking orders as the owner lets Vitess keep order_ref_idx transactionally consistent as rows are written.
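The division of labour between the two vindexes can be pictured with a toy in-memory model. This is only a sketch of the idea: `ToyLookup` and `primary_keyspace_id` are hypothetical names, a Python dict stands in for the order_ref_idx mapping table, and BLAKE2b stands in for the primary vindex function. Real lookup vindexes are maintained by Vitess inside MySQL, not in application code.

```python
from __future__ import annotations

import hashlib


def primary_keyspace_id(customer_id: int) -> bytes:
    """Stand-in for the primary vindex: any deterministic 8-byte function works here."""
    return hashlib.blake2b(customer_id.to_bytes(8, "big"), digest_size=8).digest()


class ToyLookup:
    """In-memory stand-in for the order_ref_idx mapping table (hypothetical helper)."""

    def __init__(self) -> None:
        self.rows: dict[str, bytes] = {}

    def on_insert(self, order_ref: str, customer_id: int) -> bytes:
        # Write path: the primary vindex places the row, then the mapping is recorded.
        keyspace_id = primary_keyspace_id(customer_id)
        self.rows[order_ref] = keyspace_id
        return keyspace_id

    def resolve(self, order_ref: str) -> bytes | None:
        # Read path: one mapping lookup yields the keyspace ID, hence one target shard.
        return self.rows.get(order_ref)


idx = ToyLookup()
placed = idx.on_insert("R-88213", customer_id=42)
print(idx.resolve("R-88213") == placed)  # True
print(idx.resolve("R-00000"))            # None
```

The secondary vindex never decides placement; it only remembers where the primary vindex already put the row.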

3. Model a range-partitioned keyspace instead. For an append-heavy events keyspace queried by bounded time windows, an order-preserving vindex keeps a window on one or two shards. Because such a vindex does not scramble its input, the raw values must already be spread evenly across the keyspace-ID space, so range models often shard on a pre-bucketed key:

{
  "sharded": true,
  "vindexes": {
    "event_range": { "type": "numeric" }
  },
  "tables": {
    "events": {
      "column_vindexes": [
        { "column": "event_bucket", "name": "event_range" }
      ]
    }
  }
}
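Each "name" under column_vindexes has to match a key declared under "vindexes", and a mismatch is easy to introduce when a vindex is renamed. A minimal static check, assuming the VSchema has been loaded into a Python dict with the structure shown above (`undeclared_vindex_refs` is a hypothetical helper name), might look like this:

```python
def undeclared_vindex_refs(vschema: dict) -> list[tuple[str, str]]:
    """Return (table, vindex_name) pairs that reference a name missing from 'vindexes'."""
    declared = set(vschema.get("vindexes", {}))
    problems = []
    for table, spec in vschema.get("tables", {}).items():
        for cv in spec.get("column_vindexes", []):
            if cv.get("name") not in declared:
                problems.append((table, cv.get("name")))
    return problems


# Hypothetical mistake: the vindex is declared as "event_range" but referenced by its type.
example = {
    "sharded": True,
    "vindexes": {"event_range": {"type": "numeric"}},
    "tables": {
        "events": {"column_vindexes": [{"column": "event_bucket", "name": "numeric"}]}
    },
}
print(undeclared_vindex_refs(example))  # [('events', 'numeric')]
```

Running a check like this before ApplyVSchema surfaces the mismatch while the file is still local.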

4. Verify the plan before creating any shards. vtexplain resolves a query against the VSchema offline and shows whether it routes to one shard or scatters — catch a mis-modelled table here, not in production:

vtexplain --vschema-file commerce.vschema.json --schema-file schema.sql \
  --shards 4 --sql "SELECT * FROM orders WHERE customer_id = 42"

A single-shard Route confirms the primary vindex is doing its job; a Scatter means the predicate column is not the routing column.

5. Provision the shards and move data in. Create the key ranges, then use a VReplication workflow to populate them online. Because keyspace IDs are prefix-aligned, the copy is mechanical:

for range in -40 40-80 80-c0 c0-; do
  vtctldclient CreateShard commerce/$range
done

vtctldclient MoveTables --workflow load --target-keyspace commerce create \
  --source-keyspace commerce_unsharded --tables 'orders,customers'
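The four ranges in the loop above follow from dividing the first byte of the keyspace-ID space evenly. A small sketch that generates them for any power-of-two count, assuming at most 256 shards so that a single hex byte per boundary is enough (`equal_key_ranges` is a hypothetical helper, not a Vitess command):

```python
def equal_key_ranges(shard_count: int) -> list[str]:
    """Return shard names that split the keyspace-ID space into equal ranges.

    Assumes a power-of-two count no larger than 256, so one hex byte per bound suffices.
    """
    if shard_count < 1 or shard_count > 256 or shard_count & (shard_count - 1):
        raise ValueError("shard_count must be a power of two between 1 and 256")
    step = 256 // shard_count
    bounds = [format(i * step, "02x") for i in range(1, shard_count)]
    # An empty string marks the open lower bound of the first shard and upper bound of the last.
    edges = [""] + bounds + [""]
    return [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])]


print(equal_key_ranges(4))  # ['-40', '40-80', '80-c0', 'c0-']
```

Rejecting counts that are not powers of two keeps the generated layout consistent with the clean-bisection property described earlier.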

6. Drive validation from automation idempotently. Orchestration controllers should confirm a table’s primary vindex resolves before promoting a VSchema, so a bad model never reaches production. The control plane is spoken over vtctldclient / the vtadmin API, not the SQL port:

from __future__ import annotations

import json
import subprocess


def primary_vindex(keyspace: str, table: str) -> str | None:
    """Return the name of the table's primary vindex, or None if it has none."""
    out = subprocess.run(
        ["vtctldclient", "GetVSchema", keyspace],
        capture_output=True, text=True, check=True,
    )
    vschema = json.loads(out.stdout)
    # Tolerate a VSchema with no "tables" key instead of raising KeyError.
    cvs = vschema.get("tables", {}).get(table, {}).get("column_vindexes", [])
    # The first column_vindexes entry is the primary vindex that computes the keyspace ID.
    return cvs[0]["name"] if cvs else None


def assert_routable(keyspace: str, tables: list[str]) -> None:
    """Exit non-zero if any listed table lacks a primary vindex."""
    missing = [t for t in tables if primary_vindex(keyspace, t) is None]
    if missing:
        raise SystemExit(f"tables scatter every query, no primary vindex: {missing}")

Vindex Reference

The vindexes below are the ones you will actually reach for when implementing each model. Unique means the vindex maps each input to a single keyspace ID, which makes the query targetable. Kind distinguishes functional vindexes, which compute the keyspace ID, from lookup vindexes, which read it from a table.

Vindex	Model	Kind	Unique	Notes / recommended use
`xxhash`	Hash	functional	yes	Default primary vindex for new keyspaces — fast, uniform, replaces legacy `hash`
`hash`	Hash	functional	yes	Legacy null-key DES hash; keep only for compatibility with existing keyspaces
`numeric`	Range	functional	yes	Order-preserving passthrough for integer keys; requires externally uniform values
`binary`	Range	functional	yes	Order-preserving over raw bytes; for pre-hashed or opaque ordered keys
`consistent_lookup_unique`	Lookup	lookup	yes	Secondary routing on an off-key column; transactionally consistent mapping table
`consistent_lookup`	Lookup	lookup	no	Non-unique variant for one-to-many mappings; returns a keyspace-ID set
`lookup_unique`	Lookup	lookup	yes	Older lookup with looser consistency; prefer `consistent_lookup_unique` in new work
(plugin)	Custom	functional	varies	Bespoke vindex registered in the build; you own correctness and split behaviour

Two rules govern the table. First, every sharded table needs exactly one primary vindex (the first column_vindexes entry) and it must be unique and functional — a lookup vindex cannot be primary because it would need the row to exist before it could be placed. Second, a lookup vindex must declare an owner table so Vitess keeps its mapping table consistent on writes; an unowned lookup silently drifts.
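Both rules can be checked mechanically against a VSchema loaded as a Python dict. The sketch below is a heuristic, not an official validator: `rule_violations` is a hypothetical name, and it treats any vindex type containing the substring "lookup" as a lookup vindex, which is an assumption based on the type names in the table above.

```python
def rule_violations(vschema: dict) -> list[str]:
    """List violations of the two rules: functional primary vindex, owned lookups."""
    vindexes = vschema.get("vindexes", {})
    problems = []
    for table, spec in vschema.get("tables", {}).items():
        cvs = spec.get("column_vindexes", [])
        if not cvs:
            problems.append(f"{table}: no primary vindex")
            continue
        # The first entry is the primary; look up its declared type by name.
        primary_type = vindexes.get(cvs[0]["name"], {}).get("type", "")
        if "lookup" in primary_type:
            problems.append(f"{table}: primary vindex is a lookup ({primary_type})")
    for name, spec in vindexes.items():
        if "lookup" in spec.get("type", "") and not spec.get("owner"):
            problems.append(f"{name}: lookup vindex has no owner")
    return problems


unowned = {
    "vindexes": {
        "xxhash": {"type": "xxhash"},
        "order_ref_lookup": {"type": "consistent_lookup_unique"},
    },
    "tables": {"orders": {"column_vindexes": [{"column": "customer_id", "name": "xxhash"}]}},
}
print(rule_violations(unowned))  # ['order_ref_lookup: lookup vindex has no owner']
```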

Failure Modes Specific to Partitioning

Silent scatter from a missing primary vindex. Root cause: a table listed in a sharded VSchema with no column_vindexes entry, so VTGate cannot compute a keyspace ID and fans every query to all shards. Symptoms: p99 latency scales with shard count; the single-shard hit-rate metric collapses; vtexplain shows the table's queries fanning out to every shard. Mitigation: require a primary vindex on every routable table before shards go live; enforce the check from step 6 in the deploy pipeline and gate on vtexplain.

Hotspot shard from a monotonic key under a range model. Root cause: a range vindex over an auto-increment ID or timestamp, so all new writes target the shard owning the current high end. Symptoms: one shard’s Threads_running and write latency climb while peers idle; scatter latency is dominated by that shard. Mitigation: switch the write key to a hash vindex if range scans are not required; if range semantics are mandatory, pre-bucket the key and split the hot shard ahead of demand.

Skewed distribution from a low-cardinality hash key. Root cause: hashing a column with few distinct values (a country code, a status enum), so keyspace IDs cluster and a few shards own most rows. Symptoms: per-shard dataset size and QPS diverge despite a hash model; vtexplain still shows targeted routes but load is lopsided. Mitigation: choose a higher-cardinality primary vindex column; validate key entropy with a histogram on production traffic before finalising the model.
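A histogram of this kind can be produced offline from a sample of key values. The sketch below is an approximation under stated assumptions: `shard_histogram` is a hypothetical helper, BLAKE2b stands in for the real hash vindex (so the exact shard a value lands on will differ from Vitess), and shards are assumed to be equal-width ranges numbered from 0.

```python
import hashlib
from collections import Counter


def shard_histogram(values, shard_count: int = 4) -> Counter:
    """Count sampled key values per equal-width shard using a stand-in hash."""
    counts = Counter()
    for value in values:
        digest = hashlib.blake2b(str(value).encode(), digest_size=8).digest()
        # Map the first digest byte (0..255) onto shard indexes 0..shard_count-1.
        counts[digest[0] * shard_count // 256] += 1
    return counts


# Low-cardinality key: three country codes can occupy at most three shards.
skewed = shard_histogram(["US"] * 700 + ["DE"] * 200 + ["FR"] * 100)
# High-cardinality key: 1,000 distinct IDs.
even = shard_histogram(range(1000))
print("country code:", dict(skewed))
print("distinct ids:", dict(even))
```

The point of the comparison is the shape, not the exact numbers: a hash cannot create more distinct keyspace IDs than the column has distinct values.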

Lookup vindex drift. Root cause: a lookup vindex with no owner, or an owner set on the wrong table, so the mapping table is not maintained inside the same transaction as the data. Symptoms: off-key reads miss rows that exist, or route to a stale shard after a row moves; the mapping table row count diverges from the base table. Mitigation: always declare owner; prefer consistent_lookup* variants; reconcile the mapping table periodically and after any manual data move.

Model change forcing a full redistribution. Root cause: switching primary vindex (hash → range, or changing the hash column) on a populated keyspace, which changes every row’s keyspace ID. Symptoms: Reshard cannot bisect cleanly; the workflow copies essentially the whole dataset across shard boundaries. Mitigation: treat the primary vindex as near-permanent; validate the model against real access patterns before load, and when a change is unavoidable run it as an explicit MoveTables into a freshly modelled keyspace rather than an in-place reshard.

Verification

Confirm the model routes as designed before declaring the keyspace production-ready.

Hot-path reads resolve to one shard. Run the dominant queries through vtexplain and confirm each produces a Route, not a Scatter:

vtexplain --vschema-file commerce.vschema.json --schema-file schema.sql \
  --shards 4 --sql "SELECT * FROM orders WHERE order_ref = 'R-88213'"

With the secondary lookup vindex in place this should target a single shard; without it, the same query scatters — the diff is the whole value of the model.

Distribution is even. Watch per-shard mysql.global.status.bytes and VTGate QueriesRouted broken out by shard. A well-modelled keyspace shows dataset size and write volume within a narrow band; a persistent outlier is an early signal of a skewed key or a mis-chosen model, long before it becomes a latency page.
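Once per-shard numbers have been collected, the "narrow band" test reduces to comparing each shard with the mean. A minimal sketch, assuming the metrics are already in a dict keyed by shard name; `skew_report` is a hypothetical helper and the 1.25 tolerance is an arbitrary illustrative threshold, not a Vitess recommendation:

```python
def skew_report(per_shard: dict[str, float], tolerance: float = 1.25) -> dict:
    """Summarise how far the busiest shard sits above the mean.

    ratio is max/mean (1.0 means perfectly even); outliers are the shards
    whose value exceeds mean * tolerance.
    """
    if not per_shard:
        raise ValueError("per_shard must contain at least one shard")
    mean = sum(per_shard.values()) / len(per_shard)
    if mean == 0:
        return {"ratio": 1.0, "outliers": []}
    ratio = max(per_shard.values()) / mean
    outliers = sorted(name for name, value in per_shard.items() if value > mean * tolerance)
    return {"ratio": ratio, "outliers": outliers}


print(skew_report({"-40": 400, "40-80": 100, "80-c0": 100, "c0-": 200}))
# {'ratio': 2.0, 'outliers': ['-40']}
```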

Lookup mapping is consistent. For any lookup vindex, the mapping table’s row count should track the owner table’s, and a spot-checked value should resolve to the shard that actually holds the row:

SELECT COUNT(*) FROM order_ref_idx;   -- compare against COUNT(*) FROM orders

Reads that a model cannot keep single-shard become scatter queries, writes that span shards escalate into distributed transactions, and schema changes across a partitioned keyspace must be sequenced through Online DDL orchestration. All three are costs the partitioning model exists to minimise.

Related

How to Calculate Optimal Shard Count for MySQL — sizing the shard count that a partitioning model is applied over.
Designing Horizontal Shard Topologies — turning a chosen model into a physical shard layout across failure domains.
Mastering VSchema Syntax and Structure — the JSON grammar for keyspaces, vindexes, and column_vindexes.
Configuring Lookup Vindexes for Cross-Shard Joins — the lifecycle of the mapping tables that back lookup partitioning.
VTGate Routing Architecture Deep Dive — how the router turns a keyspace ID into a targeted or scatter plan.
