Codex Logging Bug Writes 37 TB to SSD in 21 Days

37 TB in 21 Days: Codex's Logging Bug Threatens SSDs

A bug in Codex's SQLite feedback log is causing massive write amplification, with one user reporting 37 TB written to their SSD in just 21 days. At a steady rate, that extrapolates to roughly 640 TB per year, enough to consume the full warranted write endurance of many consumer SSDs (often rated around 600 TBW for a 1 TB drive) in under 12 months.
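The extrapolation is easy to check. The sketch below assumes decimal terabytes and a steady write rate across the whole period, neither of which the report guarantees. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def annual_writes_tb(written_tb: float, days: float) -> float:
    """Scale an observed write volume to a 365-day year (steady-rate assumption)."""
    return written_tb / days * 365


def days_to_exhaust(endurance_tbw: float, written_tb: float, days: float) -> float:
    """Days until a rated endurance (in TBW) is consumed at the observed rate."""
    return endurance_tbw / (written_tb / days)


def average_mb_per_s(written_tb: float, days: float) -> float:
    """Average write rate in MB/s, using decimal units (1 TB = 1,000,000 MB)."""
    return written_tb * 1_000_000 / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"{annual_writes_tb(37, 21):.0f} TB per year")
    print(f"{days_to_exhaust(600, 37, 21):.0f} days to reach 600 TBW")
    print(f"{average_mb_per_s(37, 21):.1f} MB/s on average")
```

With the reported figures this works out to about 643 TB per year, about 341 days to reach 600 TBW, and an average of roughly 20 MB/s sustained around the clock.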

The issue, tracked on GitHub, has 14 upvotes and 3 comments, but the severity is hard to overstate. The problem lies in the default TRACE-level logging configuration for the SQLite feedback log sink.

Root Cause: Global TRACE Default

The culprit is a single line in the logging setup:

Targets::new().with_default(Level::TRACE)

This sets the default log level to TRACE for all targets, including internal dependency logs and raw protocol payloads. The result is that mundane events like inotify file-open notifications, tokio-tungstenite WebSocket stream operations, and OpenTelemetry mirror events are all persisted to ~/.codex/logs_2.sqlite.
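To see why a global default matters so much, here is a deliberately simplified Python model of a filter with one default level plus per-target overrides. It is not the tracing_subscriber implementation; the prefix-matching rule, the class name, and the override values are assumptions made for illustration. The target names come from the bug report.

```python
from typing import Dict, Optional

# Lower number = more verbose. OFF sits above every real level.
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4, "OFF": 5}


class TargetFilter:
    """Toy model: a default threshold plus per-target-prefix overrides."""

    def __init__(self, default: str, overrides: Optional[Dict[str, str]] = None):
        self.default = default
        self.overrides = overrides or {}

    def threshold_for(self, target: str) -> str:
        # Longest matching prefix wins; otherwise fall back to the default.
        # (Plain string prefixes are a simplification of real target matching.)
        matches = [prefix for prefix in self.overrides if target.startswith(prefix)]
        if not matches:
            return self.default
        return self.overrides[max(matches, key=len)]

    def enabled(self, target: str, level: str) -> bool:
        return LEVELS[level] >= LEVELS[self.threshold_for(target)]


# A few (target, level) pairs taken from the report's table.
EVENTS = [
    ("log", "TRACE"),
    ("codex_api::endpoint::responses_websocket", "TRACE"),
    ("codex_client::transport", "TRACE"),
    ("codex_core::stream_events_utils", "DEBUG"),
    ("codex_otel.log_only", "INFO"),
]

# Global TRACE default: nothing is ever filtered out.
global_trace = TargetFilter("TRACE")

# Hypothetical stricter policy: INFO by default, one opt-in, one opt-out.
stricter = TargetFilter("INFO", {"codex_core": "DEBUG", "log": "OFF"})

if __name__ == "__main__":
    for name, flt in (("global TRACE", global_trace), ("stricter", stricter)):
        kept = [target for target, level in EVENTS if flt.enabled(target, level)]
        print(f"{name}: keeps {len(kept)} of {len(EVENTS)} events")
```

With a TRACE default every event passes, because no level is more verbose than TRACE. Under the hypothetical stricter policy, only the DEBUG event from the explicitly enabled target and the INFO event survive.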

According to the bug report, TRACE-level logs account for 70.7% of retained bytes in the database. The top offenders:

Target	Level	Estimated MiB
`codex_api::endpoint::responses_websocket`	TRACE	527.4
`codex_otel.log_only`	INFO	141.2
`codex_otel.trace_safe`	INFO	121.2
`log`	TRACE	97.4
`codex_client::transport`	TRACE	60.1
`codex_core::stream_events_utils`	DEBUG	27.5
`codex_api::sse::responses`	TRACE	19.1

Filtering out these categories would remove roughly 96% of retained log bytes, according to the reporter.
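The table can be tallied by level to see where the bytes sit. The sketch below uses only the seven rows listed above. It assumes that the 96% figure refers to exactly these rows, which the report does not state explicitly, so the implied total is an estimate.

```python
from collections import defaultdict

# (target, level, estimated MiB) as listed in the table above.
TOP_OFFENDERS = [
    ("codex_api::endpoint::responses_websocket", "TRACE", 527.4),
    ("codex_otel.log_only", "INFO", 141.2),
    ("codex_otel.trace_safe", "INFO", 121.2),
    ("log", "TRACE", 97.4),
    ("codex_client::transport", "TRACE", 60.1),
    ("codex_core::stream_events_utils", "DEBUG", 27.5),
    ("codex_api::sse::responses", "TRACE", 19.1),
]


def mib_by_level(rows):
    """Sum the estimated MiB per log level."""
    totals = defaultdict(float)
    for _target, level, mib in rows:
        totals[level] += mib
    return dict(totals)


def implied_total_mib(listed_mib: float, listed_share: float) -> float:
    """Total database size implied if the listed rows make up `listed_share`."""
    return listed_mib / listed_share


if __name__ == "__main__":
    totals = mib_by_level(TOP_OFFENDERS)
    listed = sum(totals.values())
    for level, mib in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{level:5s} {mib:7.1f} MiB")
    print(f"listed total  {listed:7.1f} MiB")
    print(f"implied total {implied_total_mib(listed, 0.96):7.1f} MiB")
```

The listed rows add up to 993.9 MiB, of which 704.0 MiB is TRACE. Under the stated assumption the whole database would be about 1,035 MiB, which is consistent with the roughly 1 GB retained size the reporter describes.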

Write Amplification: Insert-and-Prune

The retained database size (about 1 GB after 21 days) doesn't tell the full story. In a 15-second sample, the reporter observed:

Retained rows: 681,774 at both the start and the end of the sample
Max row ID: jumped from 5,003,347,015 to 5,003,383,226
Net effect: 36,211 rows inserted in 15 seconds (about 2,400 rows per second), while the retained count stayed flat.

This indicates a continuous insert-and-prune cycle: rows are written, indexed, logged to the WAL, then pruned. The WAL file (logs_2.sqlite-wal) grows and shrinks, but the write amplification is enormous.
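The same signature (flat row count, climbing row ID) can be reproduced in a toy database. The sketch below uses Python's built-in sqlite3 module with a hypothetical `logs` table; the real schema of logs_2.sqlite is not described in the report. It runs in memory, so it demonstrates only the counting pattern, not the WAL traffic.

```python
import sqlite3


def insert_and_prune(conn: sqlite3.Connection, batch: int, keep: int) -> None:
    """Insert `batch` rows, then delete everything but the newest `keep` rows."""
    conn.executemany(
        "INSERT INTO logs (body) VALUES (?)",
        (("x" * 64,) for _ in range(batch)),
    )
    conn.execute(
        "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?", (keep,)
    )
    conn.commit()


def sample(conn: sqlite3.Connection):
    """Return (retained row count, max row ID)."""
    return conn.execute("SELECT COUNT(*), MAX(id) FROM logs").fetchone()


def inserts_between(before, after) -> int:
    """Rows inserted between two samples, assuming IDs only ever increase."""
    return after[1] - before[1]


conn = sqlite3.connect(":memory:")
# AUTOINCREMENT guarantees that deleted IDs are never reused.
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)")

insert_and_prune(conn, batch=1000, keep=1000)  # fill up to the retention cap
before = sample(conn)
for _ in range(5):
    insert_and_prune(conn, batch=200, keep=1000)  # steady-state churn
after = sample(conn)

if __name__ == "__main__":
    print("before:", before)
    print("after: ", after)
    print("rows inserted in between:", inserts_between(before, after))
```

The retained count stays at 1,000 while the maximum ID moves from 1,000 to 2,000, so the file size alone hides 1,000 inserts. Applying `inserts_between` to the reporter's two samples gives the 36,211 rows quoted above.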

Sanitized Examples: What's Being Logged?

The most frequent TRACE events are shockingly low-value:

128,764x TRACE log: inotify event: ... mask: OPEN, name: Some("ld.so.cache")
37,982x TRACE log: inotify event: ... mask: OPEN, name: Some("locale.alias")
23,843x TRACE log: inotify event: ... mask: OPEN, name: Some("passwd")
3,639x TRACE log: /src/compat.rs:131 AllowStd.with_context
3,505x TRACE log: /src/lib.rs:245 WebSocketStream.with_context

These are raw internal library traces that provide zero diagnostic value for end users. The reporter notes that raw websocket/SSE payload bodies are also logged but intentionally not shown because they may contain private conversation content.

Proposed Fix: Better Default Filtering

The reporter suggests several concrete changes:

Don't use global TRACE for the SQLite feedback log sink. Instead, set a higher default (e.g., WARN or INFO) and selectively enable TRACE for specific targets.
Drop or raise thresholds for low-value dependency noise, especially target=log, hyper_util, tokio-tungstenite internals, inotify spam, and low-level OpenTelemetry SDK logs.
Avoid persisting full raw websocket/SSE payloads by default. Store summaries instead: event kind, duration, success/error, token usage, and payload byte length.
Avoid persisting mirrored codex_otel.log_only / codex_otel.trace_safe events unless they are explicitly useful for feedback debugging.
Add a global logs DB size/write cap. Per-thread caps are not enough when many threads/processes exist.
Provide an optional escape hatch such as sqlite_logs_enabled = false.
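Two of these suggestions, storing summaries instead of raw payloads and enforcing a global write cap, can be sketched briefly. Everything named below (`PayloadSummary`, `summarize`, `WriteBudget`, the field names) is hypothetical and not taken from the Codex codebase. The summary fields follow the reporter's list. The cap shown here is per-process only; a real global cap would also have to coordinate across processes.

```python
import json
import threading
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PayloadSummary:
    """What gets persisted instead of the raw websocket/SSE body."""
    kind: str
    duration_ms: float
    ok: bool
    tokens_used: Optional[int]
    payload_bytes: int


def summarize(kind: str, payload: bytes, duration_ms: float, ok: bool,
              tokens_used: Optional[int] = None) -> PayloadSummary:
    # Only the payload's length is kept; its content is never stored.
    return PayloadSummary(kind, duration_ms, ok, tokens_used, len(payload))


class WriteBudget:
    """Byte cap shared by every thread in one process (hypothetical design)."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.written = 0
        self._lock = threading.Lock()

    def try_write(self, record: str) -> bool:
        """Account for `record`; return False if it would exceed the cap."""
        size = len(record.encode("utf-8"))
        with self._lock:
            if self.written + size > self.max_bytes:
                return False
            self.written += size
            return True


if __name__ == "__main__":
    payload = b"x" * 10_000  # stand-in for a raw streaming payload
    record = json.dumps(asdict(summarize("response.delta", payload, 12.5, True, 42)))
    print(len(payload), "raw bytes ->", len(record), "summary bytes")

    budget = WriteBudget(max_bytes=150)
    print(budget.try_write(record), budget.try_write(record))
```

The summary record is a small, fixed-shape JSON object regardless of payload size, and because the budget is shared rather than per-thread, adding more threads does not raise the total that can be written.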

Related Issues: A Pattern of I/O Abuse

This isn't an isolated incident. The bug report links to several related issues:

#17320: Excessive SQLite WAL writes during streaming due to TRACE logs ignoring RUST_LOG
#24275: Codex Desktop rapidly grows logs_2.sqlite / WAL during normal active use
#26374: app-server: feedback log sqlite grows unbounded — ~0.75 GB/day, no retention/rotation
#22444: logs_2.sqlite-wal grows indefinitely and remains allocated after deletion
#20563: Heavy I/O activity from idle codex processes
#27020: Severe disk I/O / 100% disk active time on Windows WSL2
#27911: goals_1.sqlite write amplification: ~11 MB/s sustained writes (11 GB lifetime) on a 4 KB database
#21134: Codex Desktop becomes unusable on long active threads due to memory and TRACE log churn

What You Should Do

If you're running Codex, check the size of ~/.codex/logs_2.sqlite and the write rate to its WAL file. On Linux, iotop shows disk writes per process, and fatrace shows which files each process is touching. If you see sustained writes above a few MB/s, you're affected.

Until a fix is released, you can mitigate by:

Setting the RUST_LOG environment variable to a less verbose level (e.g., RUST_LOG=warn). Note that the SQLite sink may ignore this because of its global TRACE default, as the title of related issue #17320 suggests.
Periodically deleting the SQLite files (but be aware that open processes may keep the WAL file alive).
Monitoring SSD wear with tools like smartctl.
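If iotop is not available, a rough check can be scripted. The sketch below assumes Linux, where /proc/<pid>/io exposes a cumulative write_bytes counter for each process (readable for your own processes, or as root). The helper names and the idea of sampling twice, 15 seconds apart, are illustrative. The file names are the ones mentioned in the report.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_write_bytes(proc_io_text: str) -> int:
    """Extract the write_bytes counter from the text of /proc/<pid>/io."""
    for line in proc_io_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "write_bytes":
            return int(value)
    raise ValueError("write_bytes not found")


def write_rate_mb_s(before: int, after: int, seconds: float) -> float:
    """Average write rate in MB/s between two write_bytes samples."""
    return (after - before) / seconds / 1_000_000


def log_file_sizes(codex_dir: Optional[Path] = None) -> Dict[str, Optional[int]]:
    """Sizes in bytes of the log database and its WAL, or None if absent."""
    codex_dir = codex_dir or Path.home() / ".codex"
    sizes: Dict[str, Optional[int]] = {}
    for name in ("logs_2.sqlite", "logs_2.sqlite-wal"):
        path = codex_dir / name
        sizes[name] = path.stat().st_size if path.exists() else None
    return sizes


if __name__ == "__main__":
    import sys
    import time

    print(log_file_sizes())
    if len(sys.argv) > 1:  # usage: python check_writes.py <pid>
        io_path = Path(f"/proc/{sys.argv[1]}/io")
        first = parse_write_bytes(io_path.read_text())
        time.sleep(15)
        second = parse_write_bytes(io_path.read_text())
        print(f"{write_rate_mb_s(first, second, 15):.2f} MB/s")
```

Remember that the retained file size can stay flat while writes continue, so the rate from the two write_bytes samples is the more telling number.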

The core fix, changing the default log level from TRACE to something sane, is straightforward. The Codex team should prioritize this before users start burning through SSDs.
