Agentic coding, the sane way
What actually works when you pair with AI agents every day. No hype, no magic numbers. The practices that produce better code with less friction.
The speed trap
There is a version of this pitch where agents make you 10 or 20 times faster. That version is wrong. It conflates typing speed with engineering velocity. The bottleneck in software was never how fast you type. It was always understanding the problem, designing a solution, verifying correctness, and communicating with your team.
Agents can meaningfully reduce the cost of one thing: turning a clear thought into working code. That is valuable when it works well. But the thought still has to be clear. The code still has to be correct. And you still have to understand what you shipped.
The realistic, sustainable gain varies by context. For well scoped, routine tasks it can approach 2 to 3 times your previous speed. For novel or complex work, the gain may be modest, or even negative if you skip review. So where does the benefit come from when it does appear?
Zero communication latency
When your pair partner is the agent, you skip the Slack message, the context transfer, the "let me explain the codebase" overhead. The feedback loop shrinks from hours to seconds.
Mechanical work offloaded
CRUD endpoints, type definitions, test boilerplate, config files. The code you know exactly what it should look like but still takes 20 minutes to type. That drops to seconds.
Exploration becomes cheap
Instead of committing to one approach, you can ask the agent to sketch three. Comparing concrete implementations is faster and more honest than debating abstractions in your head.
The engineers who chase 20× end up with codebases nobody can reason about, including themselves. Six months later they are slower than when they started because every change triggers unexpected breakage in code they never actually understood. Greed is the enemy here. Aim to be genuinely, measurably, sustainably faster. That is enough.
Three axioms
Every technique in this guide follows from these. When in doubt, return here.
Context is finite and precious
Every agent has a context window. Every irrelevant token in that window dilutes the agent's attention on your actual task. Context management is not a secondary concern. It is the primary engineering challenge of working with agents.
You are the architect, always
The agent generates. You decide. The moment you stop reading output critically, stop asking "why this approach?", the agent becomes a liability factory. Your job title did not change. You are still the engineer.
Quality is the fastest path
Every shortcut in code quality creates a future tax: a bug to triage, a refactor to plan, a test to backfill. Agents amplify this dynamic. Bad code generated fast is still bad code, and now there is more of it.
Context rot and how to fight it
Context rot is what happens when you keep working in a single agent conversation too long. Old completions, debug traces, abandoned approaches, and stale instructions accumulate. The signal to noise ratio collapses. The agent starts contradicting its own earlier output, hallucinating function signatures, and quietly ignoring constraints.
This is not a bug. It is a fundamental property of how attention works in transformers. Information buried in the middle of a long context gets lower effective weight than information at the start or end. Your critical requirement from early in the conversation may be functionally invisible by message forty.
MCP: structured context, not dumped context
The Model Context Protocol gives agents a way to request specific resources (files, schemas, docs) instead of having everything pasted into the prompt. Think of it as the difference between grepping for what you need and reading the entire codebase into your brain before writing one line. MCP lets the agent pull in exactly what is relevant to the current subtask, nothing more.
In practice: instead of pasting your entire schema, configure MCP so the agent can query it. Instead of copy pasting three files, point the agent at the directory and let it read what it needs. The context stays lean. The output stays sharp.
The hidden information problem
Large context windows create a dangerous illusion. Engineers assume "it's all in the prompt, so the agent knows it." But attention is not uniform. In a 100K token context, your architectural constraint from the first message might as well not exist by message forty. This is the "lost in the middle" problem and it is well documented in the research.
Small agents, clean boundaries
This is how Claude Code and Codex actually work under the hood, and why they produce better results than naive single context approaches. The orchestrator agent receives your high level task and decomposes it. Each subtask gets dispatched to a fresh child agent with a clean context containing only the files, types, and instructions relevant to that specific piece of work.
When the child agent completes, only its result (a diff, a file, a test outcome) flows back to the parent. The debugging conversation, the false starts, the exploration all stay contained. The parent context stays lean.
Context rot is not something you manage by being careful. It is something you prevent structurally by keeping each unit of work small enough to complete within a fresh context window. The right granularity is roughly "one module, one concern, one reviewable diff."
One oversized prompt (too broad):

"Read the entire codebase, understand the architecture, implement the invitation flow, write tests, update docs, and make sure it works with the existing auth."

The same work, decomposed (right-sized):

Task 1: "Add invitations table. See schema spec in spec.md §2."
Task 2: "POST /invitations handler. Validate with Zod. Send email."
Task 3: "Tests for invitation flow."
Task 4: "Update API docs."
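The fan-out pattern behind this decomposition can be sketched in a few lines. This is a toy model only; every name here is invented for illustration, and real tools like Claude Code handle the dispatching internally.

```typescript
// Toy model of the orchestrator pattern: each subtask runs against a fresh,
// minimal context, and only the result flows back to the parent.
type Subtask = { instructions: string; relevantFiles: string[] };
type TaskResult = { summary: string };

function runWithFreshContext(task: Subtask): TaskResult {
  // The child context contains only this subtask's instructions and files.
  const contextSize =
    task.instructions.length +
    task.relevantFiles.reduce((n, f) => n + f.length, 0);
  // Exploration, debug traces, and false starts would stay inside this call.
  return { summary: `${task.instructions} (context: ${contextSize} chars)` };
}

function orchestrate(tasks: Subtask[]): TaskResult[] {
  // The parent only ever sees results, so its own context stays lean.
  return tasks.map(runWithFreshContext);
}

const results = orchestrate([
  { instructions: "Add invitations table", relevantFiles: ["spec.md"] },
  { instructions: "POST /invitations handler", relevantFiles: ["src/routes/invitations.ts"] },
]);
```

The point of the sketch is the shape, not the code: child contexts are constructed per subtask, and nothing but a small result object crosses the boundary back.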
Plan, build, verify
The single highest leverage habit is mandatory plan mode. Before any implementation, make the agent output a plan. Then read it. Actually read it. This is where you catch 80% of problems at 1% of the cost.
Write a short spec
Three to five sentences describing what you want, what the inputs and outputs are, and what the constraints are. This forces you to think before the agent types. The spec also becomes the handoff document for the next task.
Ask the agent to plan
Use /plan or simply say "propose an approach before implementing." Read the plan. Ask "why this over alternative X?" This is where architectural decisions happen cheaply.
Build one piece at a time
Each subtask should complete in under five minutes of agent time. If it takes longer, the scope is too wide. Split it. One module, one concern, one diff.
Review with your eyes first
Read the output yourself before asking the agent to self review. Then ask specific questions: "Does the error handling cover the case where the token is expired?" Specific questions produce useful answers. "Is this good?" does not.
Refactor after it works
Once tests pass, use the agent to improve naming, extract shared logic, simplify conditionals. Mechanical refactoring of working code is the agent's sweet spot.
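A concrete illustration of that sweet spot: a mechanical cleanup that turns nested conditionals into a single expression. The function and fields are hypothetical, but the before/after shape is the kind of diff this step produces.

```typescript
// Before: working but noisy nested conditionals.
function canAcceptVerbose(inv: { used: boolean; expired: boolean }): boolean {
  if (inv.used) {
    return false;
  } else {
    if (inv.expired) {
      return false;
    }
    return true;
  }
}

// After: identical behavior, one readable expression.
function canAccept(inv: { used: boolean; expired: boolean }): boolean {
  return !inv.used && !inv.expired;
}
```

Because this happens only after tests pass, equivalence is cheap to verify: run the same suite against the refactored version, or spot check all input combinations for a function this small.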
This loop in practice
```
# You write the spec (2 min)
"Add a GET /invitations/:token endpoint.
Validate the token exists in the invitations table.
If expired (>72h), return 410 Gone.
If valid, return the invitation details.
If not found, return 404."

# Agent plans (you review in 1 min)
"I'll add a route in src/routes/invitations.ts,
query the DB with a prepared statement,
check created_at against Date.now(),
and return the appropriate status codes."

# Agent implements (1 min)
# You review the diff (3 min)
# You ask: "What if created_at is null?"
# Agent fixes (30 sec)
# Run tests. Ship.
```
Total elapsed: about eight minutes. That includes thinking time. The old way would have been twenty to thirty minutes because most of that time goes to typing boilerplate and looking up date arithmetic syntax in your ORM.
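The core of that endpoint can be captured as a pure decision function, which is also what makes the review question about null timestamps trivial to test. This is an illustrative sketch, independent of any HTTP framework; the type and field names are assumptions, not the agent's actual output.

```typescript
// Decision logic for GET /invitations/:token, kept pure so it can be
// unit tested without a server. Field names are illustrative.
type Invitation = { token: string; createdAt: number | null };

const EXPIRY_MS = 72 * 60 * 60 * 1000; // 72 hours, per the spec

function invitationStatus(inv: Invitation | undefined, now: number): 200 | 404 | 410 {
  if (!inv) return 404;                            // not found
  if (inv.createdAt === null) return 410;          // the "what if created_at is null?" case
  if (now - inv.createdAt > EXPIRY_MS) return 410; // expired: 410 Gone
  return 200;                                      // valid
}
```

The route handler then only maps this status to a response body, which keeps the part worth reviewing small and testable.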
Commands, skills, and plugins
Commands that matter
These are the core controls in Claude Code. Learning to use them reflexively is the difference between fighting the tool and flowing with it.
| Command | What it does | When to reach for it |
|---|---|---|
| /plan | Forces a plan before implementation | Start of every new task |
| /compact | Compresses the context window | When output quality starts drifting |
| /clear | Resets the conversation entirely | Between logically separate tasks |
| /review | Enters structured review mode | After each implementation step |
| /init | Generates a CLAUDE.md for your project | First time setting up in a repo |
Skills: encoding your team's standards
Skills are reusable instruction files that teach the agent your patterns. Instead of explaining your migration naming convention every time, you write it once. The agent loads it when relevant.
```
When creating a database migration:
1. Filename format: YYYYMMDDHHMMSS_description.sql
2. Always include a DOWN migration
3. Comment the business reason at the top
4. Never DROP columns. Rename with _deprecated suffix.
5. Test against a pg_dump of staging before merging.
```
Skills are not prompts. They are institutional knowledge in a format the agent can consume. Your senior engineer's "always do it this way" advice, made executable.
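Applied, a skill like the one above yields migration files with a predictable shape. The filename, table, and business reason here are invented for illustration.

```sql
-- 20260312101500_add_invitations_table.sql
-- Business reason: let existing users invite teammates by email.

-- UP
CREATE TABLE invitations (
    token       TEXT PRIMARY KEY,
    email       TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- DOWN
DROP TABLE invitations;
```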
Plugins
Plugins like Superpowers extend what your agent can interact with beyond code: file systems, browsers, external APIs. The same principle applies: use them for focused, bounded tasks. A plugin that lets the agent read your staging logs to debug an issue is useful. Using it as a general purpose "do everything" button leads to context bloat.
Making agent speed safe
Agent output is probabilistic. It is usually correct, sometimes subtly wrong, occasionally confidently incorrect. You need layers of verification, not hope.
Tests as contracts
If the agent writes the feature, you write the test. If you write the spec, the agent writes the tests. Never let the same actor do both without review.
Linters and type checkers
Run them on every agent output, unconditionally. They catch unused imports, type mismatches, unreachable branches for free. These are exactly the categories of mistakes agents make most.
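One way to make "unconditionally" real is a single check script that every agent diff must pass before review. The tool choices below are illustrative and assume a TypeScript project with ESLint and Vitest configured.

```json
{
  "scripts": {
    "check": "tsc --noEmit && eslint . && vitest run"
  }
}
```

Run `npm run check` after every agent step; if it fails, the diff goes back before a human spends any review time on it.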
Review against the spec
"Does this handle the empty array case from line 3 of the spec?" is reviewable. "Does this look right?" is not.
Never auto merge
Every agent commit passes through human review. This is your architectural immune system. The five minutes of review cost nothing compared to the damage of subtle drift.
The right size for a task
Too small and you spend all your time on handoff overhead. Too large and context rot kills the output. The sweet spot is a task that produces a single reviewable diff, touches one logical concern, and completes within a clean context window.
Documentation as handoff
When you split a feature into four tasks, the document between them is critical. It is the spec, the type signatures, the edge cases. Without it, each new task starts with the agent rediscovering what the previous one already figured out.
```
# Invitation Flow
1. [done] Schema: invitations table (see migration 20260312)
2. [done] POST /invitations: create + send email
3. [next] GET /invitations/:token: validate + accept
4. [ ] UI: invitation form component
5. [ ] UI: accept invitation page
6. [ ] Integration tests for complete flow
7. [ ] Update API reference docs
```
The agent makes you smarter only if you let it
Here is the uncomfortable truth: if you only use agents to produce output faster, your skills will atrophy. You will become an operator, not an engineer. The fix is simple. Use the agent as a teaching tool, not just a typing tool.
When the agent writes something you do not fully understand, that is not a signal to merge. It is an invitation to ask questions.
Questions worth asking during review
```
# Instead of "looks good, merge"
"Why did you use a WeakMap here instead of a regular Map?"
"What is the time complexity of this lookup?"
"What happens if this promise rejects and nobody catches it?"
"Show me what changes if we used a discriminated union instead."
"Is there a way to do this without the extra dependency?"
"Walk me through the SQL query plan for this join."
```
Each of these questions takes ten seconds to ask and might teach you something you carry for years. The agent is infinitely patient. Use that. Ask "why" until you actually understand. Then decide whether to accept the code.
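The first question above is a good example of the payoff. The distinction is small but durable, and it fits in a few lines; this sketch is standalone, not from any reviewed code.

```typescript
// A WeakMap holds its keys weakly: an entry does not prevent its key object
// from being garbage collected, which suits caches keyed by live objects.
const metadata = new WeakMap<object, string>();

let session: { id: number } | null = { id: 1 };
metadata.set(session, "admin");

// While `session` is reachable, the entry is readable.
const role = metadata.get(session); // "admin"

session = null;
// Once nothing else references the object, the engine may collect it, and
// the WeakMap entry disappears with it. A regular Map would keep both the
// key and the value alive for the Map's entire lifetime: a memory leak in
// a long-lived cache.
```

Ten seconds to ask, and the answer generalizes to every per-object cache you will ever review.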
Honest breakdown
Not everything gets faster. Some things stay the same. Some things should stay the same.
Implementation and boilerplate tend to see the most compression, though results vary widely by codebase and task type. Debugging can improve because you can describe symptoms and get hypotheses quickly. Architecture, code review, and coordination stay roughly the same because they should. Those are the activities where your judgment matters most.
The compound effect
Meaningful time savings on routine work can free you to spend more time on design, which tends to reduce bugs. Documentation improves because drafts become cheap. Refactoring happens more often because the cost drops. Over time, these second order effects can compound. But only if the underlying work is consistent, sustainable, and quality focused.
Where agents help, struggle, and hurt
Where they excel
- CRUD endpoints, type definitions, config boilerplate
- Exploring multiple approaches to the same problem
- Mechanical refactoring of working code
- Generating tests for existing functions
- Drafting documentation and API references
- Explaining unfamiliar code during onboarding
Where they struggle
- Cross module changes spanning many files
- Complex state management and race conditions
- Performance tuning (they optimize locally and miss systemic bottlenecks)
- Legacy codebases with undocumented conventions
- Security critical paths requiring expert review
- Novel algorithms without clear precedent
Where they cause harm
- Accepting output without reading it
- Letting the agent decide system architecture
- Stuffing 200K tokens of noise into context
- Letting the agent write both the feature and its tests with no human review
- Adding dependencies for what a five line function solves
- Copy pasting patterns instead of extracting shared abstractions
Traps disguised as productivity
Scope creep by convenience
Because the agent types fast, you keep adding scope. "Also handle pagination. And sorting. And filtering." Each addition degrades context and compounds complexity. Ship the minimum feature. Then iterate.
Deferred review
You tell yourself you will review it later. You will not. The code is freshest in your mind right after generation. Defer by an hour and you are reading unfamiliar code, losing the reasoning, rubber stamping to move on.
Assuming shared knowledge
The agent does not know your team's conventions, your production traffic patterns, or your deployment constraints unless you tell it. Explicitly. In the current context. Not in a message from three hours ago.
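A structural fix is to put those facts where every fresh context will see them, for example in the project's CLAUDE.md. The contents below are illustrative, not a template.

```markdown
# Project conventions (always in context)
- Deploys go through staging first; never suggest direct-to-prod migrations.
- API routes live in src/routes/, one file per resource.
- Peak traffic is 9-11am UTC; avoid synchronous email sends in handlers.
```

Anything the agent must never forget belongs in a file it always loads, not in a message it has probably already lost in the middle.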
Speed as the metric
When you measure success by how fast you generated code, you optimize for the wrong thing. Measure instead: how many bugs escaped to production, how readable the code is six months later, how easy it is for a new teammate to understand.
Cognitive atrophy
If you stop thinking about why the code works and only check that it runs, your engineering judgment decays. This is the most dangerous long term risk. The fix is active engagement: ask why, evaluate alternatives, write the hard parts yourself sometimes.
Where your brain should spend its cycles
The whole point is to shift mental effort away from mechanical typing and toward design, review, and learning. If agents are adding cognitive overhead instead of removing it, something is misconfigured.
Notice that total cognitive effort does not decrease. It redistributes. You spend less on syntax and boilerplate, more on design decisions and code review. That is the correct trade. Design and review are where your experience creates the most value.
When things get noisy, return here
- I use agents to type faster, not to think less.
- I read every line I ship, regardless of who wrote it.
- I split tasks to fit clean context, not to generate more tickets.
- I write specs before implementations. Every time.
- I ask "why this approach?" before accepting any plan.
- I use review conversations to learn, not just to verify.
- I run linters, type checks, and tests on every agent output.
- I aim for sustainable speed gains and reject the fantasy of 20×.
- I value understanding over output volume.
- The code is mine. The bugs are mine. The architecture is mine.
The craft endures.
Tools change. Models improve. Context windows grow. The fundamentals remain: clear thinking, clean code, disciplined iteration, honest self assessment.
Stay sane. Ship well. Keep learning.