External Orchestration Is Overcomplicated
Durable workflows checkpoint program state to a database so that a program can resume after a crash. Most implementations, including Temporal, Airflow, and AWS Step Functions, rely on a central orchestrator that coordinates workers, checkpoints step outputs, and handles failures. DBOS argues this is unnecessary complexity.
> “If durable workflows are about databases, then there’s no reason to have a separate orchestrator server.”
Instead, use the database itself as the orchestrator.
Postgres as Orchestrator
In DBOS’s model, application servers talk directly to Postgres. A client creates a workflow entry in a Postgres table. Servers poll the table and dequeue workflows using locking clauses (e.g., SELECT ... FOR UPDATE SKIP LOCKED). Each server checkpoints step outputs back to Postgres. If a server crashes, another picks up the workflow from its last checkpoint.
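The checkpoint-and-resume behavior described above can be sketched as follows. This is a minimal illustration, not DBOS's implementation: the `step_outputs` table and the `run_step` and `workflow` functions are hypothetical names, and an in-memory SQLite database stands in for Postgres so the sketch runs without a server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres so the sketch is self-contained.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE step_outputs ("
    " workflow_id TEXT NOT NULL,"
    " step_index INTEGER NOT NULL,"
    " output TEXT NOT NULL,"
    " PRIMARY KEY (workflow_id, step_index))"
)

executed = []  # records which steps actually ran, for illustration only


def run_step(workflow_id, step_index, fn, arg):
    """Return the checkpointed output if present; otherwise run fn and checkpoint it."""
    row = db.execute(
        "SELECT output FROM step_outputs WHERE workflow_id = ? AND step_index = ?",
        (workflow_id, step_index),
    ).fetchone()
    if row is not None:
        return row[0]  # step finished before the crash, so skip re-execution
    output = fn(arg)
    executed.append(step_index)
    with db:  # commits the checkpoint on success
        db.execute(
            "INSERT INTO step_outputs (workflow_id, step_index, output) VALUES (?, ?, ?)",
            (workflow_id, step_index, output),
        )
    return output


def workflow(workflow_id, crash_after=None):
    """A hypothetical three-step workflow; crash_after simulates a server failure."""
    steps = [str.upper, lambda s: s + "!", lambda s: s * 2]
    value = "hello"
    for i, fn in enumerate(steps):
        value = run_step(workflow_id, i, fn, value)
        if crash_after == i:
            raise RuntimeError("simulated server crash")
    return value
```

Calling `workflow("wf-1", crash_after=1)` raises after two steps have been checkpointed. Calling `workflow("wf-1")` again, as a recovering server would, reuses those two outputs and executes only the third step.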
Database integrity constraints prevent duplicate execution. If two workers claim the same workflow, the second will detect a conflict when checkpointing and back off.
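One way to picture the conflict check is a primary key on the checkpoint table. The sketch below is an assumption about how such a constraint could be used, not DBOS's actual schema: `step_outputs` and `checkpoint` are hypothetical names, and in-memory SQLite again stands in for Postgres (where the same insert would fail with a unique-violation error).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres so the sketch is self-contained.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE step_outputs ("
    " workflow_id TEXT NOT NULL,"
    " step_index INTEGER NOT NULL,"
    " worker_id TEXT NOT NULL,"
    " output TEXT NOT NULL,"
    " PRIMARY KEY (workflow_id, step_index))"
)


def checkpoint(worker_id, workflow_id, step_index, output):
    """Try to record a step output; return False if another worker got there first."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO step_outputs VALUES (?, ?, ?, ?)",
                (workflow_id, step_index, worker_id, output),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a second checkpoint for the same step:
        # this worker should stop executing the workflow.
        return False
```

If `worker-1` and `worker-2` both run step 0 of the same workflow, only the first `checkpoint` call succeeds; the second returns `False`, which is the signal to back off.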
Scalability and Availability Are Postgres Problems
Horizontal scaling means adding more worker servers. Maximum throughput is then bounded by how fast Postgres can process workflows. By DBOS's account, a single Postgres instance can handle tens of thousands of workflows per second. Beyond that, the options are sharding across Postgres instances or a Postgres-compatible distributed database such as CockroachDB.
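For the sharding option, a simple approach is to route each workflow to a shard by hashing its ID. The sketch below is only an illustration of that idea: the connection strings and the `shard_for` function are hypothetical, and the source does not say how DBOS shards in practice.

```python
import hashlib

# Hypothetical connection strings; a real deployment would supply its own.
SHARDS = [
    "postgresql://shard-0.example.internal/workflows",
    "postgresql://shard-1.example.internal/workflows",
    "postgresql://shard-2.example.internal/workflows",
]


def shard_for(workflow_id, shards=SHARDS):
    """Map a workflow ID to one shard using a hash that is stable across processes."""
    digest = hashlib.sha256(workflow_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]
```

A cryptographic digest is used rather than Python's built-in `hash()`, because string hashes are randomized per process by default and every worker must agree on where a workflow lives.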
Availability is Postgres availability. Workers are fungible: any worker can recover any workflow. Postgres streaming replication with automatic failover provides high availability, and managed multi-AZ deployments (e.g., RDS, Cloud SQL) come with availability SLAs out of the box.
> “The decades of engineering work and research that have gone into operating Postgres at scale can translate directly to operating durable workflows.”
Built-in Observability via SQL
Workflow checkpoints live in Postgres tables. You can query them directly:
SELECT * FROM workflow_steps
WHERE status = 'error'
AND created_at > now() - interval '1 month';
This is possible because Postgres’s relational model and query optimizer support complex filtering. External orchestrators often use key-value stores that lack such analytical capabilities. With secondary indexes, you get efficient observability “for free.”
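As a rough illustration of index-backed observability, the sketch below builds a small `workflow_steps` table, adds a secondary index on the filtered columns, and counts recent errors per step. The column names, index name, sample rows, and `errors_by_step` function are all assumptions made for the example, and in-memory SQLite stands in for Postgres.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres so the sketch is self-contained.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE workflow_steps ("
    " workflow_id TEXT, step_name TEXT, status TEXT, created_at TEXT)"
)
# Secondary index so that filters on status and time need not scan the whole table.
db.execute("CREATE INDEX idx_steps_status_created ON workflow_steps (status, created_at)")
db.executemany(
    "INSERT INTO workflow_steps VALUES (?, ?, ?, ?)",
    [
        ("wf-1", "charge_card", "error", "2024-05-01"),
        ("wf-2", "charge_card", "error", "2024-05-02"),
        ("wf-3", "send_email", "error", "2024-05-03"),
        ("wf-4", "send_email", "success", "2024-05-03"),
        ("wf-5", "charge_card", "error", "2024-01-15"),
    ],
)


def errors_by_step(since):
    """Count failed steps per step name created after the given ISO date."""
    return db.execute(
        "SELECT step_name, COUNT(*) FROM workflow_steps"
        " WHERE status = 'error' AND created_at > ?"
        " GROUP BY step_name ORDER BY COUNT(*) DESC, step_name",
        (since,),
    ).fetchall()
```

With the sample rows above, `errors_by_step("2024-04-01")` reports two recent `charge_card` failures and one `send_email` failure, the kind of question that is awkward to ask of a key-value store.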
Fewer Points of Failure
External orchestrators introduce two points of failure: the orchestrator server and its data store. If either goes down, the whole system is unavailable. Workflow data also transits through the orchestrator, expanding the attack surface.
With Postgres-backed durable execution, the only point of failure is Postgres itself. Workflow data moves only between your application servers and your database, with no third service in the path. If you already use Postgres, you add no new critical infrastructure.
Concrete Example: Dequeueing Workflows
Here’s a simplified pattern for workers to dequeue workflows from Postgres:
-- Worker picks next available workflow atomically
UPDATE workflows
SET status = 'running', worker_id = 'worker-1'
WHERE id = (
    SELECT id FROM workflows
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING *;
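FOR UPDATE SKIP LOCKED locks the selected row and makes concurrent workers skip rows that another transaction has already locked. Each pending workflow is therefore claimed by at most one worker, and no worker blocks waiting on another.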
Why This Matters Now
Durable execution is gaining traction for building reliable distributed systems. Temporal’s popularity shows demand, but its complexity (separate service, custom SDKs) can be overkill. DBOS’s approach lets teams with Postgres experience adopt durable workflows without operating new infrastructure. It also reduces operational burden: there is one fewer system to monitor and secure.
What DBOS Offers
DBOS provides an open-source library and managed service for Postgres-backed durable execution.
- Quickstart: https://docs.dbos.dev/quickstart
- GitHub: https://github.com/dbos-inc
- Discord: https://discord.gg/eMUHrvbu67
The Verdict
This is a compelling pattern for teams already invested in Postgres. It trades the flexibility of a dedicated orchestrator for operational simplicity. For many applications, that’s a good trade.






