Agentic coding, the sane way
What actually works when you pair with AI agents every day. No hype, no magic numbers. The practices that produce better code with less friction.
The speed trap
There is a version of this pitch where agents make you 10 or 20 times faster. That version is wrong. It conflates typing speed with engineering velocity. The bottleneck in software was never how fast you type. It was always understanding the problem, designing a solution, verifying correctness, and communicating with your team.
Agents can meaningfully reduce the cost of one thing: turning a clear thought into working code. That is valuable when it works well. But the thought still has to be clear. The code still has to be correct. And you still have to understand what you shipped.
The realistic, sustainable gain varies by context. For well scoped, routine tasks it can approach 2 to 3 times your previous speed. For novel or complex work, the gain may be modest, or even negative if you skip review. So where does the benefit come from when it does appear?
Zero communication latency
When your pair partner is the agent, you skip the Slack message, the context transfer, the "let me explain the codebase" overhead. The feedback loop shrinks from hours to seconds.
Mechanical work offloaded
CRUD endpoints, type definitions, test boilerplate, config files. The code you know exactly what it should look like but still takes 20 minutes to type. That drops to seconds.
Exploration becomes cheap
Instead of committing to one approach, you can ask the agent to sketch three. Comparing concrete implementations is faster and more honest than debating abstractions in your head.
The engineers who chase 20× end up with codebases nobody can reason about, including themselves. Six months later they are slower than when they started because every change triggers unexpected breakage in code they never actually understood. Greed is the enemy here. Aim to be genuinely, measurably, sustainably faster. That is enough.
Three axioms
Every technique in this guide follows from these. When in doubt, return here.
Context is finite and precious
Every agent has a context window. Every irrelevant token in that window dilutes the agent's attention on your actual task. Context management is not a secondary concern. It is the primary engineering challenge of working with agents.
You are the architect, always
The agent generates. You decide. The moment you stop reading output critically, stop asking "why this approach?", the agent becomes a liability factory. Your job title did not change. You are still the engineer.
Quality is the fastest path
Every shortcut in code quality creates a future tax: a bug to triage, a refactor to plan, a test to backfill. Agents amplify this dynamic. Bad code generated fast is still bad code, and now there is more of it.
Context rot and how to fight it
Context rot is what happens when you keep working in a single agent conversation too long. Old completions, debug traces, abandoned approaches, and stale instructions accumulate. The signal to noise ratio collapses. The agent starts contradicting its own earlier output, hallucinating function signatures, and quietly ignoring constraints.
This is not a bug. It is a fundamental property of how attention works in transformers. Information buried in the middle of a long context gets lower effective weight than information at the start or end. Your critical requirement from early in the conversation may be functionally invisible by message forty.
MCP: structured context, not dumped context
The Model Context Protocol gives agents a way to request specific resources (files, schemas, docs) instead of having everything pasted into the prompt. Think of it as the difference between grepping for what you need and reading the entire codebase into your brain before writing one line. MCP lets the agent pull in exactly what is relevant to the current subtask, nothing more.
In practice: instead of pasting your entire schema, configure MCP so the agent can query it. Instead of copy pasting three files, point the agent at the directory and let it read what it needs. The context stays lean. The output stays sharp.
The hidden information problem
Large context windows create a dangerous illusion. Engineers assume "it's all in the prompt, so the agent knows it." But attention is not uniform. In a 100K token context, your architectural constraint from the first message might as well not exist by message forty. This is the "lost in the middle" problem and it is well documented in the research.
Small agents, clean boundaries
This is how Claude Code and Codex actually work under the hood, and why they produce better results than naive single context approaches. The orchestrator agent receives your high level task and decomposes it. Each subtask gets dispatched to a fresh child agent with a clean context containing only the files, types, and instructions relevant to that specific piece of work.
When the child agent completes, only its result (a diff, a file, a test outcome) flows back to the parent. The debugging conversation, the false starts, the exploration all stay contained. The parent context stays lean.
Context rot is not something you manage by being careful. It is something you prevent structurally by keeping each unit of work small enough to complete within a fresh context window. The right granularity is roughly "one module, one concern, one reviewable diff."
One oversized prompt (too broad):

"Read the entire codebase, understand the architecture, implement the invitation flow, write tests, update docs, and make sure it works with the existing auth."

The same work, decomposed (right-sized):

Task 1: "Add invitations table. See schema spec in spec.md §2."
Task 2: "POST /invitations handler. Validate with Zod. Send email."
Task 3: "Tests for invitation flow."
Task 4: "Update API docs."
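The fan-out pattern behind this decomposition can be sketched in a few lines. This is a toy model only; every name here is invented for illustration, and real tools like Claude Code handle the dispatching internally.

```typescript
// Toy model of the orchestrator pattern: each subtask runs against a fresh,
// minimal context, and only the result flows back to the parent.
type Subtask = { instructions: string; relevantFiles: string[] };
type TaskResult = { summary: string };

function runWithFreshContext(task: Subtask): TaskResult {
  // The child context contains only this subtask's instructions and files.
  const contextSize =
    task.instructions.length +
    task.relevantFiles.reduce((n, f) => n + f.length, 0);
  // Exploration, debug traces, and false starts would stay inside this call.
  return { summary: `${task.instructions} (context: ${contextSize} chars)` };
}

function orchestrate(tasks: Subtask[]): TaskResult[] {
  // The parent only ever sees results, so its own context stays lean.
  return tasks.map(runWithFreshContext);
}

const results = orchestrate([
  { instructions: "Add invitations table", relevantFiles: ["spec.md"] },
  { instructions: "POST /invitations handler", relevantFiles: ["src/routes/invitations.ts"] },
]);
```

The point of the sketch is the shape, not the code: child contexts are constructed per subtask, and nothing but a small result object crosses the boundary back.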
Plan, build, verify
The single highest leverage habit is mandatory plan mode. Before any implementation, make the agent output a plan. Then read it. Actually read it. This is where you catch 80% of problems at 1% of the cost.
Write a short spec
Three to five sentences describing what you want, what the inputs and outputs are, and what the constraints are. This forces you to think before the agent types. The spec also becomes the handoff document for the next task.
Ask the agent to plan
Use /plan or simply say "propose an approach before implementing." Read the plan. Ask "why this over alternative X?" This is where architectural decisions happen cheaply.
Build one piece at a time
Each subtask should complete in under five minutes of agent time. If it takes longer, the scope is too wide. Split it. One module, one concern, one diff.
Review with your eyes first
Read the output yourself before asking the agent to self review. Then ask specific questions: "Does the error handling cover the case where the token is expired?" Specific questions produce useful answers. "Is this good?" does not.
Refactor after it works
Once tests pass, use the agent to improve naming, extract shared logic, simplify conditionals. Mechanical refactoring of working code is the agent's sweet spot.
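A concrete illustration of that sweet spot: a mechanical cleanup that turns nested conditionals into a single expression. The function and fields are hypothetical, but the before/after shape is the kind of diff this step produces.

```typescript
// Before: working but noisy nested conditionals.
function canAcceptVerbose(inv: { used: boolean; expired: boolean }): boolean {
  if (inv.used) {
    return false;
  } else {
    if (inv.expired) {
      return false;
    }
    return true;
  }
}

// After: identical behavior, one readable expression.
function canAccept(inv: { used: boolean; expired: boolean }): boolean {
  return !inv.used && !inv.expired;
}
```

Because this happens only after tests pass, equivalence is cheap to verify: run the same suite against the refactored version, or spot check all input combinations for a function this small.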
This loop in practice
```
# You write the spec (2 min)
"Add a GET /invitations/:token endpoint.
Validate the token exists in the invitations table.
If expired (>72h), return 410 Gone.
If valid, return the invitation details.
If not found, return 404."

# Agent plans (you review in 1 min)
"I'll add a route in src/routes/invitations.ts,
query the DB with a prepared statement,
check created_at against Date.now(),
and return the appropriate status codes."

# Agent implements (1 min)
# You review the diff (3 min)
# You ask: "What if created_at is null?"
# Agent fixes (30 sec)
# Run tests. Ship.
```
Total elapsed: about eight minutes. That includes thinking time. The old way would have been twenty to thirty minutes because most of that time goes to typing boilerplate and looking up date arithmetic syntax in your ORM.
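The core of that endpoint can be captured as a pure decision function, which is also what makes the review question about null timestamps trivial to test. This is an illustrative sketch, independent of any HTTP framework; the type and field names are assumptions, not the agent's actual output.

```typescript
// Decision logic for GET /invitations/:token, kept pure so it can be
// unit tested without a server. Field names are illustrative.
type Invitation = { token: string; createdAt: number | null };

const EXPIRY_MS = 72 * 60 * 60 * 1000; // 72 hours, per the spec

function invitationStatus(inv: Invitation | undefined, now: number): 200 | 404 | 410 {
  if (!inv) return 404;                            // not found
  if (inv.createdAt === null) return 410;          // the "what if created_at is null?" case
  if (now - inv.createdAt > EXPIRY_MS) return 410; // expired: 410 Gone
  return 200;                                      // valid
}
```

The route handler then only maps this status to a response body, which keeps the part worth reviewing small and testable.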
Commands, skills, and plugins
Commands that matter
These are the core controls in Claude Code. Learning to use them reflexively is the difference between fighting the tool and flowing with it.
| Command | What it does | When to reach for it |
|---|---|---|
| /plan | Forces a plan before implementation | Start of every new task |
| /compact | Compresses the context window | When output quality starts drifting |
| /clear | Resets the conversation entirely | Between logically separate tasks |
| /review | Enters structured review mode | After each implementation step |
| /init | Generates a CLAUDE.md for your project | First time setting up in a repo |
Skills: encoding your team's standards
Skills are reusable instruction files that teach the agent your patterns. Instead of explaining your migration naming convention every time, you write it once. The agent loads it when relevant.
```
When creating a database migration:
1. Filename format: YYYYMMDDHHMMSS_description.sql
2. Always include a DOWN migration
3. Comment the business reason at the top
4. Never DROP columns. Rename with _deprecated suffix.
5. Test against a pg_dump of staging before merging.
```
Skills are not prompts. They are institutional knowledge in a format the agent can consume. Your senior engineer's "always do it this way" advice, made executable.
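Applied, a skill like the one above yields migration files with a predictable shape. The filename, table, and business reason here are invented for illustration.

```sql
-- 20260312101500_add_invitations_table.sql
-- Business reason: let existing users invite teammates by email.

-- UP
CREATE TABLE invitations (
    token       TEXT PRIMARY KEY,
    email       TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- DOWN
DROP TABLE invitations;
```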
Plugins
Plugins like Superpowers extend what your agent can interact with beyond code: file systems, browsers, external APIs. The same principle applies: use them for focused, bounded tasks. A plugin that lets the agent read your staging logs to debug an issue is useful. Using it as a general purpose "do everything" button leads to context bloat.
Making agent speed safe
Agent output is probabilistic. It is usually correct, sometimes subtly wrong, occasionally confidently incorrect. You need layers of verification, not hope.
Tests as contracts
If the agent writes the feature, you write the test. If you write the spec, the agent writes the tests. Never let the same actor do both without review.
Linters and type checkers
Run them on every agent output, unconditionally. They catch unused imports, type mismatches, unreachable branches for free. These are exactly the categories of mistakes agents make most.
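One way to make "unconditionally" real is a single check script that every agent diff must pass before review. The tool choices below are illustrative and assume a TypeScript project with ESLint and Vitest configured.

```json
{
  "scripts": {
    "check": "tsc --noEmit && eslint . && vitest run"
  }
}
```

Run `npm run check` after every agent step; if it fails, the diff goes back before a human spends any review time on it.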
Review against the spec
"Does this handle the empty array case from line 3 of the spec?" is reviewable. "Does this look right?" is not.
Never auto merge
Every agent commit passes through human review. This is your architectural immune system. The five minutes of review cost nothing compared to the damage of subtle drift.
The right size for a task
Too small and you spend all your time on handoff overhead. Too large and context rot kills the output. The sweet spot is a task that produces a single reviewable diff, touches one logical concern, and completes within a clean context window.
Documentation as handoff
When you split a feature into four tasks, the document between them is critical. It is the spec, the type signatures, the edge cases. Without it, each new task starts with the agent rediscovering what the previous one already figured out.
```
# Invitation Flow
1. [done] Schema: invitations table (see migration 20260312)
2. [done] POST /invitations: create + send email
3. [next] GET /invitations/:token: validate + accept
4. [ ] UI: invitation form component
5. [ ] UI: accept invitation page
6. [ ] Integration tests for complete flow
7. [ ] Update API reference docs
```
The agent makes you smarter only if you let it
Here is the uncomfortable truth: if you only use agents to produce output faster, your skills will atrophy. You will become an operator, not an engineer. The fix is simple. Use the agent as a teaching tool, not just a typing tool.
When the agent writes something you do not fully understand, that is not a signal to merge. It is an invitation to ask questions.
Questions worth asking during review
```
# Instead of "looks good, merge"
"Why did you use a WeakMap here instead of a regular Map?"
"What is the time complexity of this lookup?"
"What happens if this promise rejects and nobody catches it?"
"Show me what changes if we used a discriminated union instead."
"Is there a way to do this without the extra dependency?"
"Walk me through the SQL query plan for this join."
```
Each of these questions takes ten seconds to ask and might teach you something you carry for years. The agent is infinitely patient. Use that. Ask "why" until you actually understand. Then decide whether to accept the code.
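The first question above is a good example of the payoff. The distinction is small but durable, and it fits in a few lines; this sketch is standalone, not from any reviewed code.

```typescript
// A WeakMap holds its keys weakly: an entry does not prevent its key object
// from being garbage collected, which suits caches keyed by live objects.
const metadata = new WeakMap<object, string>();

let session: { id: number } | null = { id: 1 };
metadata.set(session, "admin");

// While `session` is reachable, the entry is readable.
const role = metadata.get(session); // "admin"

session = null;
// Once nothing else references the object, the engine may collect it, and
// the WeakMap entry disappears with it. A regular Map would keep both the
// key and the value alive for the Map's entire lifetime: a memory leak in
// a long-lived cache.
```

Ten seconds to ask, and the answer generalizes to every per-object cache you will ever review.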
Honest breakdown
Not everything gets faster. Some things stay the same. Some things should stay the same.
Implementation and boilerplate tend to see the most compression, though results vary widely by codebase and task type. Debugging can improve because you can describe symptoms and get hypotheses quickly. Architecture, code review, and coordination stay roughly the same because they should. Those are the activities where your judgment matters most.
The compound effect
Meaningful time savings on routine work can free you to spend more time on design, which tends to reduce bugs. Documentation improves because drafts become cheap. Refactoring happens more often because the cost drops. Over time, these second order effects can compound. But only if the underlying work is consistent, sustainable, and quality focused.
Where agents help, struggle, and hurt
Where they excel
- CRUD endpoints, type definitions, config boilerplate
- Exploring multiple approaches to the same problem
- Mechanical refactoring of working code
- Generating tests for existing functions
- Drafting documentation and API references
- Explaining unfamiliar code during onboarding
Where they struggle
- Cross module changes spanning many files
- Complex state management and race conditions
- Performance tuning (they optimize locally and miss systemic bottlenecks)
- Legacy codebases with undocumented conventions
- Security critical paths requiring expert review
- Novel algorithms without clear precedent
Where they cause harm
- Accepting output without reading it
- Letting the agent decide system architecture
- Stuffing 200K tokens of noise into context
- Letting the agent write both the feature and its tests with no human review
- Adding dependencies for what a five line function solves
- Copy pasting patterns instead of extracting shared abstractions
Traps disguised as productivity
Scope creep by convenience
Because the agent types fast, you keep adding scope. "Also handle pagination. And sorting. And filtering." Each addition degrades context and compounds complexity. Ship the minimum feature. Then iterate.
Deferred review
You tell yourself you will review it later. You will not. The code is freshest in your mind right after generation. Defer by an hour and you are reading unfamiliar code, losing the reasoning, rubber stamping to move on.
Assuming shared knowledge
The agent does not know your team's conventions, your production traffic patterns, or your deployment constraints unless you tell it. Explicitly. In the current context. Not in a message from three hours ago.
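A structural fix is to put those facts where every fresh context will see them, for example in the project's CLAUDE.md. The contents below are illustrative, not a template.

```markdown
# Project conventions (always in context)
- Deploys go through staging first; never suggest direct-to-prod migrations.
- API routes live in src/routes/, one file per resource.
- Peak traffic is 9-11am UTC; avoid synchronous email sends in handlers.
```

Anything the agent must never forget belongs in a file it always loads, not in a message it has probably already lost in the middle.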
Speed as the metric
When you measure success by how fast you generated code, you optimize for the wrong thing. Measure instead: how many bugs escaped to production, how readable the code is six months later, how easy it is for a new teammate to understand.
Cognitive atrophy
If you stop thinking about why the code works and only check that it runs, your engineering judgment decays. This is the most dangerous long term risk. The fix is active engagement: ask why, evaluate alternatives, write the hard parts yourself sometimes.
Where your brain should spend its cycles
The whole point is to shift mental effort away from mechanical typing and toward design, review, and learning. If agents are adding cognitive overhead instead of removing it, something is misconfigured.
Notice that total cognitive effort does not decrease. It redistributes. You spend less on syntax and boilerplate, more on design decisions and code review. That is the correct trade. Design and review are where your experience creates the most value.
When things get noisy, return here
- I use agents to type faster, not to think less.
- I read every line I ship, regardless of who wrote it.
- I split tasks to fit clean context, not to generate more tickets.
- I write specs before implementations. Every time.
- I ask "why this approach?" before accepting any plan.
- I use review conversations to learn, not just to verify.
- I run linters, type checks, and tests on every agent output.
- I aim for sustainable speed gains and reject the fantasy of 20×.
- I value understanding over output volume.
- The code is mine. The bugs are mine. The architecture is mine.
The craft endures.
Tools change. Models improve. Context windows grow. The fundamentals remain: clear thinking, clean code, disciplined iteration, honest self assessment.
Stay sane. Ship well. Keep learning.