Writing Claude.md – Making Claude 4.x Behave Like an Engineering System
The traditional LLM usage pattern is Prompt → Output. Claude 4.x operates closer to Workflow → Tooling → State → Verification → Iteration.
Claude.md is not a prompt—it's a Work Contract. It defines how the model should plan, execute, collaborate, maintain state, and verify outcomes.
Below are practical observations and the official Claude 4.x best practices, with examples, pitfalls, and prompt patterns. The notes come from sustained use of Claude Code (Anthropic's coding CLI built on Sonnet 4.5 and Opus 4.5) on the OpenClaw project across 2025–2026.
1. Claude.md: From Prompt to Workflow
Claude 4.x shows significant improvements in:
- Long-horizon reasoning
- Multi-step planning
- State tracking
- Context awareness
- Subagent orchestration
- Verification via tests
- Tool-assisted execution
This means:
Without a work contract, the model uses its own strategies With a contract, the model follows your defined process
Claude.md is that contract.
2. Claude.md Structure
Recommended skeleton:
PURPOSE
WORKFLOW
STATE MANAGEMENT
TESTING / VERIFICATION
TOOL POLICY
SUBAGENT CONTRACTS (optional)
COMPLETION CRITERIA
RISK / FAILURE MODES
Not every section is required, but this structure produces consistent behavior.
3. Workflow Design and Multi-Window Strategy
Claude doesn't work like traditional ChatGPT-style "generate everything at once." It breaks tasks into phases (windows):
Typical pattern:
- Build architecture or plan (Plan)
- Tests as correctness oracle (Tests)
- Minimal implementation
- Verification
- Refactor + Docs
- Commit state
Key characteristics:
- Tests are immutable
- Tests serve as correctness oracle
- State externalized (preferably in filesystem / git)
- Reset window beats stuffing summaries
- External memory > Token memory
4. State Management
Official best practices provide clear direction:
- Structured state (JSON / YAML)
- Notes (Markdown)
- Git for timeline & diff
- Immutable tests
- Externalize progress
Claude uses these states to recover context without relying on token memory, avoiding hallucination. For runtime checks beyond unit tests, complement this with verification through tracing in production.
5. Subagents and Tooling
Claude can autonomously determine:
- What needs delegating
- What needs parallel calls
- What suits Explorer vs Planner vs Executor
With a subagent contract:
Explorer: read files, summarize structure
Planner: create task breakdown
Coder: apply changes
Tester: verify via tests
Collaboration is far more stable than monolithic prompting.
6. Claude.md Example (Engineering Practice)
A simplified example showing how to influence behavior:
WORKFLOW
1. Build architectural plan before coding.
2. Write tests as correctness oracle. Tests are immutable.
3. Implement minimal version.
4. Verify using tests.
5. Refactor for readability + maintainability.
6. Produce documentation on design choices.
STATE
- Use JSON for machine-readable state.
- Use Markdown for progress notes.
- Use git for diff + timeline.
COMPLETION
- All tests pass.
- Code readable.
- No regressions.
- States committed.
Even this short section will influence Claude's behavior.
7. Common Failure Modes and Mitigation Strategies
Practical observations of where Claude struggles, with prompt patterns to address them.
(1) Over-optimizing to Pass Tests
Symptoms:
- Claude hacks tests
- Uses helper scripts or bypass tools
- Hard-codes edge cases
- Over-optimizes correctness at the cost of maintainability
Suggested prompt:
Do not optimize for passing tests. Optimize for correctness, maintainability, and readability.
Use standard tools. Avoid custom helper scripts unless explicitly requested.
(2) Over-engineering
Opus is particularly prone to:
- Excessive abstraction
- Too many interfaces
- Introducing 3 helper modules for a trivial feature
Prompt:
Prefer simplicity. Avoid unnecessary abstractions.
Optimize for clarity over flexibility.
(3) Answering Without Reading Code
Claude sometimes skips the reading step but answers confidently.
Prompt:
Before answering, read the relevant code files using the explorer.
Summarize what you read before proposing changes.
(4) Hallucinated APIs / Modules
In coding, hallucination commonly occurs when:
- Not accessing the filesystem
- Insufficient context
- "Try-to-be-helpful" speculation
Prompt:
If information is missing or ambiguous, ask before assuming.
Do not invent functions or modules.
(5) Testing Shortcuts
Claude takes shortcuts to pass tests rather than following proper engineering workflows.
Example prompt:
Use standard tools and test frameworks only.
Do not bypass verification using helper scripts or custom stubs.
Do not hard-code to satisfy tests.
8. The Improvement Case: From Chatbot to System
Claude's greatest capability is not:
"Generating a text response"
But rather:
"Maintaining tasks → Tracking state → Verifying → Correcting → Completing → Closing"
Claude.md helps surface this capability.
9. Summary
Claude.md is essentially:
- Workflow spec
- Execution contract
- Verification guide
- State policy
- Tooling interface
- Completion definition
Claude 4.x is among the first LLMs with genuine:
- Multi-window
- State-aware
- Test-driven
- Agentic coding
capabilities—which is why it needs an "instruction manual." Claude.md is that manual.
Related Posts
- Multi-Context Workflows and State Management – Managing state across sessions
- Agent Skills Best Practice – Designing skills that agents can reliably select and execute
- Introduction to Claude Code – Getting started with Claude Code CLI