I tried to break it. It fixed itself.

One real deployment run is a data point, not a validation - we learned that last time. The deploy cartridge had handled a static site to Vercel cleanly - happy path, no surprises. Before calling it done I wanted something harder: migrations, environment config, a rollback that actually executes.

So I built a test app. Next.js, Prisma, Neon PostgreSQL. The kind of stack where a deployment has moving parts: schema migrations run against a live database, the app goes out, and if something fails you can’t just revert a file and pretend the database didn’t change. That’s the scenario I wanted the cartridge to navigate.

It took four runs to get there.

The setup

Getting the test app running took longer than it should have. A Git initialisation gone wrong sent the first push to the wrong repository. A Biome linter config rejected the generated Prisma client. Vercel’s GitHub integration auto-deployed a broken commit before I could run the cartridge manually. None of this was the cartridge’s fault - it was just friction, the kind that shows up whenever you’re assembling several tools for the first time.

By the time the baseline was green - app deployed, database connected, /api/health returning {"status":"ok","db":"connected"} - I had a working scaffold and a clear plan for the failure scenarios.

The cartridge that wouldn’t let me fail

Run two: I introduced a deliberate syntax error into page.tsx. The plan was to let the build fail after the migration had already applied, then see how the cartridge handled a VERIFY failure with a live schema change in play.

The cartridge found the broken file, fixed it, committed the change locally, and deployed successfully. VERIFY passed.

I stared at that for a moment.

On one hand: impressive. It identified the problem, classified it correctly, applied a clean fix. On the other hand: I had done everything deliberately to create a failure state, and the cartridge had quietly removed it. It believed it was doing the right thing. It probably was, technically. But it had taken the initiative without asking, and that put me in the passenger seat of my own deployment.

Run three: I broke /api/health instead - a deliberate throw new Error("deliberate failure") that would survive the build but fail at verification. Same result. The cartridge found the commented-out working implementation, restored it, and VERIFY passed again.

That’s when the concern landed properly. A VERIFY phase that remediates failures doesn’t tell you whether your deployment is healthy. It tells you whether the cartridge could patch it to look healthy. Those are different things.

What the findings actually were

Three things came out of those runs:

DEPLOY fixes code without asking. A deploy cartridge that modifies source files unilaterally is overstepping. The right behaviour is to surface the issue, classify it as a code problem rather than a deployment problem, and ask: abort and hand back, or fix and continue? The decision belongs to the developer.

Local commits aren’t flagged. When the cartridge did make a fix, it committed locally but didn’t push - which is the correct call. But the handoff didn’t mention it. The developer had to discover the uncommitted state from git log. That’s the kind of thing that gets missed at handoff time and causes confusion on the next run.

VERIFY must be read-only. This is the critical one. Verification is an observation phase. If it modifies files to make checks pass, the PASS result is meaningless. The rule is simple: VERIFY observes, reports, and verdicts. It does not write.

Run four

With those fixes in place, I ran it again. Broke /api/health, deployed as-is.

VERIFY failed cleanly. No remediation attempt. The failure was reported exactly as found: /api/health → 500, DB connectivity unconfirmed. ROLLBACK triggered, the previous deployment was promoted, and the post-rollback checks confirmed the previous version was stable.

The migration dilemma - the scenario I’d originally designed for, where a schema change makes rollback unsafe - didn’t fully materialise. The rollback happened quickly enough that no data had been written to the new column, so the window was still safe. That’s realistic. The dangerous case only surfaces when a broken deploy stays live long enough for the application to write against the new schema. The cartridge had the rollback window logic in place; it just didn’t need to invoke it.

Four runs. One genuine finding that changed the cartridge’s behaviour in a meaningful way, and two smaller ones that made the output more trustworthy. That’s a reasonable outcome for a validation run.

What “too clever” costs you

The cartridge fixing my deliberate failures wasn’t a bug. It was doing what a helpful assistant does - identifying a problem and solving it. The issue is that deployment is one of the few contexts where that instinct is actively harmful.

When a deployment fails, you need to know that it failed. You need the actual signal, not a corrected version of it. A tool that smooths over failures to keep the process moving is optimising for the wrong thing.

The fix is a gate, not a constraint. The cartridge can still offer to fix a code issue - that’s useful. It just has to ask first. The developer decides whether this is a fix-and-continue situation or a stop-and-investigate one. Keeping that decision in human hands is the point.

VERIFY doesn’t even get that option. It’s read-only. Full stop.

The deploy cartridge is now validated on something with real moving parts. The rollback path works. The failure classification gate works. VERIFY no longer fixes what it finds. It’s now in the public repo. If you run it on something that breaks it - I’d like to know. Feedback welcome.