Intent, verified.
Write what your system ought to do in plain English. Ought generates tests that verify it.
Plain markdown specs
Specs live as plain markdown — renders in GitHub, reviewable in a PR, readable in any editor. No DSL, no special syntax.
LLM-generated tests
Ought reads your spec and your source, then generates tests that enforce each clause. Change the sentence, regenerate the test.
Clause-mapped results
Every failure maps back to a specific clause in the spec. You know exactly what broke, and why.
Rich clause vocabulary
The keyword shapes the test: severity (MUST/SHOULD/MAY/WONT), deadlines (MUST BY), invariants (MUST ALWAYS), fallbacks (OTHERWISE), and preconditions (GIVEN).
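To make the vocabulary concrete, here is a minimal Python sketch of how a clause line could be split into its keyword and its sentence. It is not Ought's parser. The bullet-plus-bold layout is taken from the Booking Agent example on this page, and the names `KEYWORDS`, `Clause`, and `parse_clause` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Keywords named on this page. How Ought itself tokenizes them is an assumption.
KEYWORDS = [
    "MUST ALWAYS", "MUST NOT", "MUST BY", "MUST",
    "SHOULD", "MAY", "WONT", "OTHERWISE", "GIVEN",
]

# Matches a markdown bullet whose first token is a bold keyword,
# e.g. "- **MUST** confirm dates and destination before searching".
CLAUSE_RE = re.compile(
    r"^\s*-\s+\*\*(" + "|".join(KEYWORDS) + r")\*\*\s+(.*\S)\s*$"
)


class Clause(NamedTuple):
    keyword: str  # e.g. "MUST NOT"
    text: str     # the plain-English requirement


def parse_clause(line: str) -> Optional[Clause]:
    """Return the clause on this line, or None if the line is not a clause."""
    match = CLAUSE_RE.match(line)
    if match is None:
        return None
    return Clause(keyword=match.group(1), text=match.group(2))
```

Headings and metadata lines such as `source:` return `None`, so only keyworded bullets are treated as testable clauses in this sketch.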
Git-aware diagnostics
ought debug blame and ought debug bisect trace any failure back to the commit that broke it.
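For readers unfamiliar with bisection, the idea can be sketched in a few lines of Python. This illustrates the general binary-search technique that git bisect is built on, not Ought's implementation. `first_bad_commit` and `clause_passes` are hypothetical names, and the sketch assumes the oldest commit is known to pass and the newest is known to fail.

```python
from typing import Callable, Sequence


def first_bad_commit(
    commits: Sequence[str],
    clause_passes: Callable[[str], bool],
) -> str:
    """Find the first commit at which a clause's test stops passing.

    `commits` is ordered oldest to newest. Assumes commits[0] passes,
    commits[-1] fails, and the test never recovers once it breaks.
    """
    good, bad = 0, len(commits) - 1  # indices of a known-good and known-bad commit
    while bad - good > 1:
        mid = (good + bad) // 2
        if clause_passes(commits[mid]):
            good = mid  # the breakage happened after mid
        else:
            bad = mid   # the breakage happened at or before mid
    return commits[bad]
```

Each step halves the range, so a history of n commits needs roughly log2(n) test runs rather than n.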
Spec and code, round-tripped
Specs flow both ways: ought generate writes tests from a spec; ought extract audits what's already specified, then drafts new specs for the gaps.
Plain markdown. RFC-style keywords.
Human-readable, version-controlled, and renders natively in GitHub. No DSL to learn — just sentences with MUST, SHOULD, and WONT.
# Booking Agent
context: Travel booking assistant via chat
source: src/agents/booking/
## Reservation flow
- **MUST** confirm dates and destination before searching
- **MUST** present at least 3 options with prices
- **MUST NOT** book without explicit user confirmation
- **SHOULD** remember user preferences across sessions
- **WONT** handle payment; hand off to checkout flow
$ ought run
Booking Agent booking.ought.md
------------------------------------------------
Reservation flow
✓ MUST confirm dates and destination
✓ MUST present at least 3 options with prices
✓ MUST NOT book without explicit confirmation
✓ SHOULD remember user preferences
⊘ WONT handle payment (confirmed absent)
4 passed · 0 failed · 1 confirmed absent
MUST coverage: 3/3 (100%)
Intent → test → result → intent.
Each clause becomes a test. Each test produces a result. Each result is mapped back to the originating clause. The loop closes automatically — no orphan failures, no untraceable passes.
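The mapping from results back to clauses can be sketched as a small data structure. The Python below is an illustration under stated assumptions, not Ought's internal model: `ClauseResult`, its `clause_id` field, the status strings, and both helper functions are hypothetical. It also assumes that MUST coverage counts passing clauses whose keyword starts with MUST (including MUST NOT) against all such clauses.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ClauseResult:
    clause_id: str  # hypothetical stable identifier for the clause
    keyword: str    # e.g. "MUST", "MUST NOT", "SHOULD", "WONT"
    text: str       # the sentence from the spec
    status: str     # "passed", "failed", or "absent" (labels are assumptions)


def must_coverage(results: List[ClauseResult]) -> Tuple[int, int]:
    """Return (passing, total) over clauses in the MUST family."""
    musts = [r for r in results if r.keyword.startswith("MUST")]
    passing = sum(1 for r in musts if r.status == "passed")
    return passing, len(musts)


def failures_by_clause(results: List[ClauseResult]) -> Dict[str, ClauseResult]:
    """Index every failure by the clause that produced it."""
    return {r.clause_id: r for r in results if r.status == "failed"}


# The five clauses from the Booking Agent run shown above.
booking_results = [
    ClauseResult("reservation-flow/1", "MUST", "confirm dates and destination before searching", "passed"),
    ClauseResult("reservation-flow/2", "MUST", "present at least 3 options with prices", "passed"),
    ClauseResult("reservation-flow/3", "MUST NOT", "book without explicit user confirmation", "passed"),
    ClauseResult("reservation-flow/4", "SHOULD", "remember user preferences across sessions", "passed"),
    ClauseResult("reservation-flow/5", "WONT", "handle payment", "absent"),
]
```

Because every result carries its clause identifier, a failure can always be looked up by clause, which is the property the loop above depends on.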