The Problem: Partitioning by created_at Leaks Into Every Query

Partitioning by created_at forces the date column into the primary key (PRIMARY KEY (id, created_at)). That means every query that looks up by id alone now scans all partitions. The "fix" teams apply is to add created_at filters to every query — a leaky abstraction that becomes a code review checklist item.

-- Before partitioning: single index seek
SELECT * FROM orders WHERE id = 12345;

-- After partitioning by created_at: scans 36 partitions
SELECT * FROM orders WHERE id = 12345;

-- "Fixed" query with date filter: prunes to one partition
SELECT * FROM orders WHERE id = 12345 AND created_at >= '2026-03-01' AND created_at < '2026-04-01';

PostgreSQL and MySQL both require the partition key to be part of any unique constraint. So PRIMARY KEY (id) becomes PRIMARY KEY (id, created_at). Uniqueness on id is lost — two rows can have the same id with different timestamps. Query plans degrade: const lookups become ref lookups, and joins that were eq_ref become slower ref.

The Better Way: Partition by Primary Key

Partition by the primary key column (or any monotonically increasing key). For a table with BIGINT AUTO_INCREMENT, the id column is naturally ordered and commonly used in WHERE clauses.

CREATE TABLE orders (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  customer_id BIGINT NOT NULL,
  total_cents INT NOT NULL,
  created_at DATETIME NOT NULL
)
PARTITION BY RANGE (id) (
  PARTITION p0001 VALUES LESS THAN (100000000),
  PARTITION p0002 VALUES LESS THAN (200000000),
  PARTITION p0003 VALUES LESS THAN (300000000),
  PARTITION p_future VALUES LESS THAN MAXVALUE
);

Now every query that filters by id (which is most of them) gets partition pruning automatically. No application code changes needed. The primary key stays PRIMARY KEY (id), preserving uniqueness and optimal query plans.

Automating Partition Boundary Management

Static boundaries fail when growth projections change. The solution is a background service that monitors the active partition and splits the catch-all partition before it fills up.

-- Split the catch-all into a new bounded partition + new catch-all
ALTER TABLE orders REORGANIZE PARTITION p_future INTO (
  PARTITION p0037 VALUES LESS THAN (3700000000),
  PARTITION p_future VALUES LESS THAN MAXVALUE
);

If the catch-all is empty or small at split time, the operation is metadata-only and fast. The service must stay ahead of write workload, handle partial failures, and be idempotent on retry.

Handling Time-Based Retention Without a Date Key

Even though the partition key is id, boundaries can still align with time by mapping ID ranges to time intervals. For example, if 100M IDs correspond to roughly one month, you can set boundaries accordingly. The partitioner service can adjust boundaries dynamically based on observed growth rates.

Trade-offs and Considerations

  • Partitioning by id does not directly support time-based retention queries (e.g., "delete orders older than 90 days"). However, you can still drop old partitions by tracking which ID ranges correspond to past time periods.
  • The background service adds operational complexity: it must run DDL safely without locking out writes, survive crashes, and coordinate with other tooling.
  • For hash and list partitioning, a similar "watch and adjust" pattern applies, but with different operations (e.g., adding new partitions for new list values).

Bottom Line

Partitioning by a date column introduces coupling between storage and application code, degrades query performance, and loses uniqueness guarantees. Partitioning by the primary key keeps queries clean, preserves performance, and lets a background service manage boundaries dynamically. The operational overhead of the service is a worthwhile trade for avoiding the leaky abstraction.