← back

Context Drift Is the Hidden Tax on Manual AI Workflows

#ai-workflows#developer-tools#product-thinking#infrastructure#context-management

Why replacing a working development tool mid-project, before finishing in-progress ship work, was the right sequencing call for my AI-assisted build loop.

Spent Sunday installing Claude Code and deciding whether it replaces the Cursor-based loop I've been running on Duskglow for about ten days. The short version: it does. The longer version is what made that decision clear enough to act on, and that's the part worth writing down.

I walked into the session expecting to evaluate features. I walked out having diagnosed an architecture problem I'd been optimizing around without naming.

The real inefficiency wasn't typing speed

My current Duskglow loop looks roughly like this. Plan in a Claude Project chat. Receive generated code. Paste it into files in Cursor. Run it. Hit errors. Describe the errors back to Claude. Receive revised code. Paste again. Repeat until it works.

When I asked whether Cursor's tab autocomplete could improve this, the honest answer was no. The reason surprised me.

The inefficiency isn't typing. It's context drift. Three versions of the code exist in parallel at any given moment: the AI's model of what the code looks like, what's actually on disk, and my mental model of both. Every iteration, those three silently diverge. The tokens and time are visible costs. The invisible one is the AI's advice quality degrading across turns, because it's advising against a version of the codebase that no longer exists.

You can't fix context drift with a better autocomplete. Autocomplete helps when you type code. Agentic AI helps when you prompt for code. Before picking a tool, diagnose which mode the workflow actually lives in. I had been picking tools for the wrong mode.

Don't optimize the layer. Replace it.

Once I named the drift problem, the Cursor-vs-Claude-Code question answered itself. Claude Code edits files directly on disk, which collapses the three versions into one source of truth. The advantage is structural. Feature comparisons would have pointed the other way.

That distinction matters because feature-level comparisons would have pulled me toward the wrong decision. Cursor's autocomplete is better. Cursor's in-editor chat UI is denser. Claude Code is a terminal-first tool with a VS Code extension, which is a less polished starting surface. If I had walked through feature parity, Cursor would have won most rows.

But the root cause of the inefficiency wasn't at the feature layer, so feature comparisons were going to mislead me. Structural problems don't respond to feature-level fixes. When the diagnosis says "the layer is wrong," the response is to replace the layer, not to shop for a better version of the wrong one.

A quieter signal I noticed partway through: the features I reach for most in Cursor are not the ones Cursor is known for. That's worth naming on its own. If your strongest reasons for using a tool aren't the tool's headline features, the tool is probably not the right fit. You've just adapted.

Infrastructure first, ship work second

Here's where I almost shipped a mistake.

My initial plan for the next Duskglow chat was to validate the Claude Code transition by running it through a piece of in-progress Phase 2 work I had queued (Ship 1 Part 3, the crypto infrastructure scaffolding). Two birds, one session.

Reading back the plan, I caught the priority inversion. The transition affects every future Duskglow session. Ship 1 Part 3 is a single session's worth of work. Running new ship work on infrastructure I've already decided to replace compounds the old infrastructure's cost across more work, not less.

The correct order is transition first, validate it with a small verification task, then resume Ship 1 Part 3 in the new workflow. Infrastructure changes that affect all future work take precedence over individual ship tasks. Especially when the ship task is in-flight, actually, because it's tempting to justify continuing on old infrastructure with "I'll just finish this one thing first."

That justification is how a quick transition turns into a slow one, and how a decided replacement drags for weeks because there's always one more urgent thing queued on the old stack.

Validate before subtracting

My conviction on Claude Code was high by the end of the session. The temptation was to cancel Cursor immediately. I'm not going to.

The rule I'm holding: removing a known-working tool should be evidence-based, not vibe-based. A few real working sessions on Duskglow through Claude Code, with Cursor still installed as a fallback, and then the cancel decision gets made on actual experience rather than day-one excitement.

Small discipline, but it compounds. Most of the tooling regrets I've had in this project were acts of premature subtraction. Cancelling a subscription, deleting a file, removing a dependency before I had a working replacement fully validated. Keeping a known-working backup for two more weeks costs almost nothing. Removing it too early and needing it back costs measured in hours and scramble.

Mental models

A few things from today I expect to keep reaching for.

Context drift is the hidden tax on manual copy-paste AI workflows. The visible costs are tokens and time. The invisible cost is the AI's model of the code diverging from what's on disk, which silently degrades advice quality across iterations. Collapsing the three versions (AI model, disk reality, mental model) into one is the architectural fix. Better autocomplete is not.

Autocomplete helps you type code. Agentic AI helps you prompt for code. Two different workflows. Picking tools for the wrong one will feel productive and leave you optimizing around the real problem without naming it.

Where quality compounds across many uses, invest upstream. Where it's one-shot, don't. Applies to prompts, docs, subagents, and code. A reusable artifact repays a careful first pass many times over. A one-shot output doesn't, and over-engineering it is a leak.

Infrastructure changes take precedence over in-progress ship work. Once you've decided to replace the layer, finish replacing it before loading more work onto the old stack. The urge to finish one more thing first is exactly how transitions stall out.

Validate before subtracting. Removing a known-working tool should be evidence-based. The cost of keeping it around for a few more sessions is almost nothing. The cost of removing it too early is measured in the scramble to get it back.

Plans designed without system context need a gap sweep before execution. When a plan is made in one system about another system, its blast radius is unknown. Sweeping the target system before acting is cheap insurance against surprises that would otherwise land mid-implementation. The version of me that wants to "just start" is usually the version that pays for the shortcut later.

Nothing in today's work shipped in Duskglow itself. The PADR didn't get regenerated, the trigger clause didn't get added, the crypto infrastructure is still queued. By some measures it was a zero-progress day.

By the measure that actually matters, which is velocity across the remaining Phase 2 work, it was one of the higher-leverage days of the month. Replacing the execution layer is worth a Sunday.