How to run AI coding agents overnight, safely

Unattended agents can ship real work while you sleep — or wreck things. The guardrails that make overnight runs safe.

Stuart Leo, June 8, 2026 (3 min read)

The dream is real: brief an agent before bed, wake up to a finished feature on a branch. Unattended agents genuinely can ship work overnight. The nightmare is also real: wake up to a branch full of confident damage, or a cost alert, or worse. The difference between the two is entirely guardrails.

Here's how to run agents unattended so the morning is a pleasant surprise, not a cleanup.

The promise and the risk

When you're in the loop, you catch problems as they happen — a wrong turn, a bad assumption, a runaway cost. Unattended, none of that. The agent runs the full loop with no one to stop it, which is exactly what makes it powerful and exactly what makes it dangerous.

The core risk is subtle: an agent that behaves perfectly while you watch can behave very differently on the run you don't. As practitioners who have written up overnight agent runs put it, the same prompt might succeed nine times and fail catastrophically on the tenth, and unattended means no one's there for the tenth. So you don't rely on it behaving. You build the cage.

Isolate: branches and worktrees

Rule one: an overnight agent never works on main, and never where it can step on anything live.

Give it its own branch, or better, its own git worktree — a separate working directory entirely. Better still for risky work, a sandboxed environment (a container or microVM) with no access to your host machine. Whatever it does, it does in a box you can inspect in the morning and throw away if it's wrong.
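As a minimal sketch of the isolation step, assuming the agent is launched from a Python wrapper, the helper below builds a `git worktree add` command that creates a fresh branch in its own directory and refuses protected branch names. The function name, the `PROTECTED` set, the repo path, and the branch name are all illustrative assumptions, not part of any particular agent tool.

```python
from pathlib import PurePosixPath

# Branches an agent must never work on directly (assumed names; adjust to your repo).
PROTECTED = {"main", "master", "production"}


def worktree_command(repo: str, branch: str, base: str = "main") -> list[str]:
    """Build the git command that creates `branch` from `base` in its own worktree."""
    if branch in PROTECTED:
        raise ValueError(f"refusing to run an agent on protected branch {branch!r}")
    repo_path = PurePosixPath(repo)
    # Place the worktree beside the repo, e.g. /home/me/myrepo-agent-fix-login
    target = repo_path.parent / f"{repo_path.name}-{branch.replace('/', '-')}"
    return ["git", "-C", str(repo_path), "worktree", "add", "-b", branch, str(target), base]


# Hypothetical repo path and branch name, for illustration only.
print(" ".join(worktree_command("/home/me/myrepo", "agent/fix-login")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would create the worktree. In the morning, `git worktree remove <path>` discards it if the result is not worth keeping.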

Restrict: file scope and forbidden zones

Rule two: name what it may touch, and what it absolutely may not.

Define forbidden zones explicitly, before the run: no deployment pipelines, no auth systems, no payment code, no secrets. These are the places where a confident mistake becomes a catastrophe, and there's no overnight task worth that risk. Restrict the file scope to the feature at hand. The agent should wake up in a small, well-fenced yard, not the whole property.
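One way to make the fence checkable is to compare the files the agent changed against an allow-list and a deny-list. The sketch below assumes a hypothetical repository layout; every pattern in `ALLOWED` and `FORBIDDEN` is a placeholder to replace with your own paths.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns; replace with the real layout of your repository.
ALLOWED = ["src/profile_page/*", "tests/profile_page/*"]
FORBIDDEN = ["deploy/*", ".github/workflows/*", "src/auth/*", "src/payments/*", "*.env", "secrets/*"]


def scope_violations(changed, allowed=ALLOWED, forbidden=FORBIDDEN):
    """Return the changed paths that hit a forbidden zone or fall outside the allowed scope."""
    bad = []
    for path in changed:
        in_forbidden = any(fnmatchcase(path, pat) for pat in forbidden)
        in_allowed = any(fnmatchcase(path, pat) for pat in allowed)
        # The forbidden list wins even if a path also matches an allowed pattern.
        if in_forbidden or not in_allowed:
            bad.append(path)
    return bad


changed = [
    "src/profile_page/form.py",
    "src/auth/session.py",
    "README.md",
]
print(scope_violations(changed))
```

The list of changed paths could come from `git diff --name-only main...HEAD` on the agent's branch. A non-empty result is a reason to reject the run before reading any of the diff.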

Stuart Leo: "Never give an overnight agent the keys to production, auth, or payments. The upside of an unattended change there is never worth the downside."

Verify: tests and pre-commit gates

Rule three: nothing gets committed that hasn't passed the gate.

Enforce pre-commit hooks that run linting and the test suite, so broken work is rejected at commit time. Hooks can be skipped with git commit --no-verify, so deny the agent that flag or back the hook with a CI check on the branch. Give it a full test suite that runs without production dependencies, and ideally a CI pipeline it can trigger. This is where test-driven development earns its keep twice over: the tests are the unattended agent's only honest judge of whether it succeeded. No green, no commit.
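A gate of this kind can be sketched as a small function that runs each check in order and reports the first failure. The commands in `GATE` (`ruff check .` and `pytest -q`) are assumptions about the project's tooling; substitute whatever linter and test runner you actually use.

```python
import subprocess
import sys

# Assumed commands; swap in your project's own linter and test runner.
GATE = [
    ["ruff", "check", "."],
    ["pytest", "-q"],
]


def run_gate(commands=GATE):
    """Run each check in order; return 0 if all pass, else the first failing exit code."""
    for cmd in commands:
        try:
            code = subprocess.run(cmd).returncode
        except FileNotFoundError:
            # A missing tool counts as a failure, never as a silent pass.
            code = 127
        if code != 0:
            print(f"gate failed: {' '.join(cmd)}", file=sys.stderr)
            return code
    return 0
```

Saved as an executable `.git/hooks/pre-commit` script that ends with `sys.exit(run_gate())`, this makes git abort the commit whenever a check fails, because a pre-commit hook that exits non-zero stops the commit.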

Cap: cost and the no-prod rule

Rule four: bound the blast radius, in dollars and in scope.

Set a hard cost cap so a stuck agent looping all night can't run up a fortune. Pair it with the no-prod rule from above as the two non-negotiables: a money ceiling and a production fence. With those in place, the worst case is a wasted branch and a small bill — annoying, not disastrous.
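If the agent runs inside a loop you control, the cost cap can be a few lines that stop the run once spending reaches the limit. This is a sketch under that assumption: `CostCap` and `BudgetExceeded` are hypothetical names, and how you obtain the cost of each call depends on your provider. If your provider also offers an account-level spending limit, it is worth setting that as a second line of defense.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the run has used up its whole allowance."""


class CostCap:
    """Tracks spend for one unattended run and halts it at a hard limit."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd: float) -> float:
        """Record one call's cost; raise once the cap is reached, else return what is left."""
        self.spent_usd += amount_usd
        if self.spent_usd >= self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f}; stopping the run"
            )
        return self.limit_usd - self.spent_usd


cap = CostCap(limit_usd=2.0)  # illustrative limit, not a recommendation
remaining = cap.charge(0.5)   # call this after every model request
print(f"${remaining:.2f} left")
```

The wrapper that drives the agent would catch `BudgetExceeded`, stop issuing requests, and leave the branch as it stands for review in the morning.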

A safety file like GUARDRAILS.md — or, better, your contextbase — helps here too: it carries the lessons from past failures into the next run, so the agent stops repeating the mistakes that bit you before.

Never give an overnight agent access to prod, auth, or payments — and never let it run without a test gate. Get those two right and unattended runs go from reckless to genuinely useful.

Start here: write acceptance criteria the agent can run unattended, read the field note on an overnight run gone wrong, or read the method.

FAQ

Can you run AI coding agents unattended overnight?

Yes, and it can ship real work, but only behind guardrails. Run each agent on its own branch or worktree, restrict the files it can touch, require tests and pre-commit checks to pass, cap the cost, and never give it access to production, auth, or payment systems. Without those, an unattended agent is a liability.

What's the biggest risk of running agents unattended?

That the same prompt that succeeds nine times fails catastrophically on the tenth, and no one is watching when it does. The mitigations are isolation (branches, sandboxes), restriction (no prod, no secrets), and verification (tests must pass before anything is committed).

How do I limit what an overnight agent can do?

Restrict file access, run it in an isolated branch or sandboxed environment, enforce pre-commit hooks with linting and tests, set a hard cost cap, and explicitly forbid touching deployment pipelines, auth, and payment code. Define the forbidden zones before you start, not after.
