Conversation-Driven Development — Claude Code

The workflow

State a small, well-scoped intent. Watch Claude propose. Accept, or redirect with a single sentence of feedback. Verify after each step. Commit when green. Repeat.

That’s the entire workflow. It sounds trivial. It is not — most of the difference between week-one frustration and week-four flow with Claude Code is the discipline to actually run this loop rather than the prompt-and-hope variant where you describe the whole feature and walk away.
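
The control flow of the loop can be sketched in a few lines of Python. This is an illustration only: `propose`, `review`, `verify`, and `commit` are hypothetical callables standing in for Claude's proposal, your read of the diff, your test run, and `git commit`. None of them is part of Claude Code.

```python
from typing import Callable, Optional


def conversation_loop(
    intent: str,
    propose: Callable[[str, Optional[str]], str],
    review: Callable[[str], Optional[str]],
    verify: Callable[[str], bool],
    commit: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Drive one small step: propose, accept or redirect, verify, commit.

    All four callables are hypothetical stand-ins. `review` returns None to
    accept a proposal, or a one-sentence redirect to send back.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        change = propose(intent, feedback)
        feedback = review(change)
        if feedback is not None:
            continue  # redirected: ask for a revised proposal
        if verify(change):
            commit(change)  # green: this is the save point
            return True
        feedback = "Verification failed; read the failure and try again."
    return False  # too many rounds: restate the intent or start fresh
```

The `max_rounds` cap is an assumption of the sketch; it reflects the idea that a step which keeps bouncing is a sign the intent needs restating.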

When to reach for it

Conversation-driven development is the default mode for non-trivial work with Claude Code. Specifically:

Multi-file changes. Any task that touches more than one or two files benefits from the conversation loop because you want to catch wrong turns before they propagate.
Unfamiliar code. When the model knows the codebase better than you do (or you’re orienting in a new project), the back-and-forth surfaces the right context fast.
Hard-to-specify tasks. When you can’t precisely say what “done” looks like up front, the conversation is how you discover it.
Tasks where verification is cheap. Tight test suites, fast type-checkers, immediate visual feedback in the browser — all amplify the conversation loop because each iteration costs seconds.

Don’t reach for it for truly trivial tasks such as renaming a variable or fixing a typo; just edit. Don’t reach for it for long-horizon autonomous work (overnight large refactors, batch processing) either; that is a job for headless mode or the Agent SDK.

Step-by-step

1. Open with intent, not instructions

Bad: “Read auth.ts, then read user.ts, then modify login() to call refresh().”

Good: “Add a refresh-token flow to login() in auth.ts. Existing access tokens should keep working.”

The first version constrains the model’s plan to the steps you guessed at; the second lets it discover the right plan from the code. Intent-first prompts produce better solutions because the model often has better information about the code structure than you do at the moment you start.

2. Choose a permission mode

Default mode is right for most work — Claude proposes risky actions and waits for approval. Accept-edits mode skips approval on file edits but still confirms shell commands; useful when you trust the direction and want to remove micro-frictions. Bypass mode is the “trust me” mode — useful for sandboxed agents, dangerous for daily work.

If you’re starting a task larger than a few minutes of effort, start in plan mode, where Claude lays out its approach for your review before it changes any files. The 30 seconds spent on a plan saves the hour spent on the wrong implementation.
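
To keep the three approval modes straight, the behaviour described above can be written as a small decision table. This is a toy model of the prose, not Claude Code's implementation: the `Mode` and `Action` names are labels chosen for this sketch, and the real tool's rules are finer-grained.

```python
from enum import Enum


class Mode(Enum):
    DEFAULT = "default"
    ACCEPT_EDITS = "accept-edits"
    BYPASS = "bypass"


class Action(Enum):
    READ_FILE = "read"
    EDIT_FILE = "edit"
    RUN_SHELL = "shell"


def needs_approval(mode: Mode, action: Action) -> bool:
    """Return True if the action pauses for human approval in this toy model."""
    if action is Action.READ_FILE:
        return False  # assumption of the sketch: reading is not treated as risky
    if mode is Mode.BYPASS:
        return False  # nothing is confirmed
    if mode is Mode.ACCEPT_EDITS:
        return action is Action.RUN_SHELL  # edits go through, shell still asks
    return True  # DEFAULT: edits and shell commands both wait for approval
```

Read this way, accept-edits removes exactly one row of friction (file edits) while keeping the shell prompt, which is why it suits work where you already trust the direction.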

3. Establish a baseline

Before the first change, know that the existing tests pass and the type-checker is clean. If they don’t, fix that first or you can’t tell what your changes broke.

For UI work, get the dev server running and the page open. For backend work, get the test suite watching. The baseline is what makes verification cheap.
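
A baseline check can be as small as running each verification command once and recording whether it exits cleanly. The sketch below assumes nothing about your project: `check_baseline` is a hypothetical helper, and you supply the commands.

```python
import subprocess
from typing import Dict, List


def check_baseline(commands: Dict[str, List[str]]) -> Dict[str, bool]:
    """Run each named check once and record whether it exited with status 0."""
    results: Dict[str, bool] = {}
    for name, argv in commands.items():
        try:
            completed = subprocess.run(argv, capture_output=True, text=True)
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            # The command itself is missing; treat that as a failing baseline.
            results[name] = False
    return results
```

For a TypeScript project you might pass something like `{"tests": ["npm", "test"], "types": ["npx", "tsc", "--noEmit"]}`; those commands are examples, not a requirement. If any value comes back `False`, fix that before asking for the first change.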

4. Take one small step

Let Claude make the smallest reasonable change toward the intent. Read the diff. Run the tests. For UI work, confirm the page still renders in the dev server.

The temptation is to let it run for ten minutes and then review a 400-line diff. Resist it. A 40-line diff verified four times beats a 400-line diff reviewed once.
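
One way to hold yourself to small steps is to measure the pending diff before reviewing it. The sketch below parses the output of `git diff --numstat` (tab-separated added, deleted, path). The 40-line budget is borrowed from the sentence above as an illustrative default, not a rule, and both function names are hypothetical.

```python
def changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of counts
        total += int(added) + int(deleted)
    return total


def step_is_small(numstat: str, limit: int = 40) -> bool:
    """True when the pending diff stays within the chosen line budget."""
    return changed_lines(numstat) <= limit
```

You could feed it `subprocess.run(["git", "diff", "--numstat"], capture_output=True, text=True).stdout` and stop to review as soon as `step_is_small` returns `False`.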

5. Redirect with a sentence

When Claude goes the wrong way — wrong abstraction, missed edge case, weird naming — push back in one sentence. “Use a constant instead of inlining the magic number.” “That’s the wrong file — the logic lives in src/services/, not src/api/.”

You don’t need to re-explain the whole task. Claude has the context; it just made a wrong turn. Single-sentence redirects keep the conversation moving.

6. Commit when green

When the step passes verification, commit it. Small green commits are save points. They mean you can always reset to a known-good state without losing earlier work.

Don’t let the diff grow across multiple steps without committing: the longer you go without a save point, the more a bad redirect costs you.
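
"Commit when green" can be made mechanical by gating the commit on the verification command. A minimal sketch, assuming you are happy to stage everything with `git add -A`; `commit_if_green` and the injectable `runner` are hypothetical names, the latter there so the logic can be exercised without touching a real repository.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run(argv: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def commit_if_green(
    verify_cmd: Sequence[str],
    message: str,
    runner: Runner = run,
) -> bool:
    """Commit all changes only if the verification command succeeds."""
    if runner(verify_cmd) != 0:
        return False  # red: leave the working tree uncommitted
    if runner(["git", "add", "-A"]) != 0:
        return False
    return runner(["git", "commit", "-m", message]) == 0
```

Calling `commit_if_green(["npm", "test"], "Add refresh-token flow")` would then create a save point only when the suite passes; the test command here is just an example.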

7. Repeat or hand off

After the commit, either continue in the same conversation (next step toward the same intent) or wrap up. Long sessions accumulate context drag; if the next step is unrelated, start a new session.

For tasks that span days, write the in-progress state into CLAUDE.md or memory so the next session picks up cleanly.
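
Writing the in-progress state down can be a one-function habit. The sketch below appends a dated section to a notes file such as CLAUDE.md. The section layout ("Done" and "Next" lists under an "In progress" heading) is an assumption of this example, not a format Claude Code requires, and `append_progress_note` is a hypothetical helper.

```python
from datetime import date
from pathlib import Path
from typing import List


def append_progress_note(
    path: Path, done: List[str], next_steps: List[str], today: date
) -> str:
    """Append a dated in-progress section to a notes file and return it."""
    lines = [f"## In progress ({today.isoformat()})", "", "Done:"]
    lines += [f"- {item}" for item in done]
    lines += ["", "Next:"]
    lines += [f"- {item}" for item in next_steps]
    note = "\n".join(lines) + "\n"

    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if existing and not existing.endswith("\n"):
        existing += "\n"
    if existing:
        existing += "\n"  # blank line between old content and the new note
    path.write_text(existing + note, encoding="utf-8")
    return note
```

Keeping the note to what is done and what comes next mirrors the commit-boundary idea: the next session should be able to start from it without rereading the whole conversation.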

Anti-patterns

The shapes that don’t work:

Prompt-and-hope. “Implement the whole feature with tests” and walk away. Sometimes works. Often produces a wrong-shape solution that takes longer to rework than it would have taken to write from scratch. The bigger the task, the worse this ratio gets.
Over-specified prompts. Telling Claude exactly which files to touch and in what order. You lose the model’s planning ability and inherit all the risk of having guessed wrong. State intent; trust the planning.
Skipping verification. “Looks right, ship it.” This is how a 90%-correct diff becomes a 60%-correct merge after the third unverified step. Verify after every step; that’s the whole point of small steps.
Letting redirects accumulate. When you redirect three times in a row and the model keeps missing, the problem is usually the intent, not the model. Stop, restate the intent, possibly start fresh.
Ignoring the tests when they go red. “Probably flaky, let me re-run.” Sometimes true. Often the model made a wrong assumption. Read the failure before re-running.
Mega-sessions without commits. Four hours, fifteen unverified changes, one giant commit at the end. If anything’s wrong, you don’t know what or when. Save often.
Treating Claude as autocomplete. Letting it suggest, accepting most of what it says, never redirecting. The model is excellent but not infallible; the conversation is where you contribute.
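
The cost of skipping verification is easiest to see as compounding. The sketch below uses a deliberately simple model in which every unverified step is independently 90% correct. That independence is an assumption made here for illustration, and real errors that build on each other can degrade faster.

```python
def fraction_correct(per_step: float, unverified_steps: int) -> float:
    """Fraction still correct after stacking unverified steps.

    Assumes each step is independently `per_step` correct, which is a
    simplification chosen for this sketch, not a measured rate.
    """
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be between 0 and 1")
    if unverified_steps < 0:
        raise ValueError("unverified_steps must not be negative")
    return per_step ** unverified_steps


for steps in (1, 3, 5):
    print(steps, round(fraction_correct(0.9, steps), 2))
# prints:
# 1 0.9
# 3 0.73
# 5 0.59
```

Verifying after each step resets the exponent to one, which is the arithmetic behind keeping steps small.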

Evaluation

How do you know the workflow is working?

Time-to-green-commit is short. Each step takes minutes, not hours. If commits are rare, your steps are too big.
Redirect rate is low. You’re redirecting Claude on fewer than about 20% of proposals. If the rate is higher, your intent is unclear or your tests aren’t catching the right things.
Test suite stays green after each commit. Always. If you ever ship a commit with a red suite, you’ve lost a save point.
You can describe what you changed without looking at git log. The session was scoped enough that the changes fit in your head. If you can’t, the session was too long or too unfocused.
You can hand off to a teammate at any commit boundary. Each green commit is a clean state someone else could pick up. If commits aren’t clean handoffs, you’re under-committing.
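
If you want to put a number on the redirect rate, a hand-kept tally is enough. The sketch below is a hypothetical helper, not a Claude Code feature: you call `record` once per proposal. The 20% threshold comes from the list above, and the three-in-a-row streak echoes the anti-pattern about accumulating redirects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SessionTally:
    """Hand-kept tally of proposals and redirects for one session."""

    proposals: int = 0
    redirects: int = 0
    streak: int = 0  # consecutive redirects so far

    def record(self, redirected: bool) -> None:
        """Log one proposal and whether you had to redirect it."""
        self.proposals += 1
        if redirected:
            self.redirects += 1
            self.streak += 1
        else:
            self.streak = 0

    @property
    def redirect_rate(self) -> float:
        return self.redirects / self.proposals if self.proposals else 0.0

    def warnings(self, max_rate: float = 0.20, max_streak: int = 3) -> List[str]:
        """Return plain-language warnings when a threshold is reached."""
        notes: List[str] = []
        if self.redirect_rate >= max_rate:
            notes.append("redirect rate high: clarify intent or tighten tests")
        if self.streak >= max_streak:
            notes.append(f"{self.streak} redirects in a row: restate the intent")
        return notes
```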

A common failure mode for week-one users: long sessions, big diffs, deferred verification. The fix is mechanical: shorter sessions, smaller steps, faster verification. The model’s quality is usually fine; the workflow shape is the problem.

Side by side, the two approaches compare as follows.

Conversation-driven development. Small steps, verify each, commit when green. Feels slower per step but is faster overall because rework is rare. Easy to redirect mid-task. Natural commit history.

Prompt-and-hope. Big prompt, long generation, review at the end. Feels faster per step but is slower overall because rework is common. Hard to redirect; you usually start over. Messy commit history, if any.

One concrete metric I track

For sessions I want to evaluate, I count commits per hour of session time. Sustained good work sits around 3–6 commits per hour. Below 1, the session is meandering — usually too-big steps or unclear intent. Above 10, the steps are likely trivial; consider whether Claude Code is the right tool versus an inline edit. The sweet spot is small enough to verify, big enough to be worth the round-trip.
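
The metric is simple enough to compute from commit timestamps. The sketch below is one way to do it, with hypothetical function names; the bands are the ones given above, and rates the text gives no verdict on (between 1 and 3, or between 6 and 10) are reported as "in between".

```python
from typing import List


def commits_per_hour(commit_times: List[int], start: int, end: int) -> float:
    """Commits per hour between two Unix timestamps (in seconds)."""
    if end <= start:
        raise ValueError("end must be after start")
    in_session = [t for t in commit_times if start <= t <= end]
    return len(in_session) / ((end - start) / 3600)


def classify(rate: float) -> str:
    """Map a commits-per-hour rate onto the bands described in the text."""
    if rate < 1:
        return "meandering"
    if rate > 10:
        return "likely trivial"
    if 3 <= rate <= 6:
        return "sweet spot"
    return "in between"  # the text gives no verdict for these rates
```

The timestamps can come from `git log --format=%ct`, which prints one committer timestamp per commit; the session start and end are whatever you noted when you sat down and stopped.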