Your agent doesn't have to live on your laptop

The Forge

Last issue, in Sparks, I said we might build one of these. Here it is.

Claude Managed Agents went to public beta, and it changes where your agent lives. Until now, running an agent meant you owned the whole stack: the agent loop, the tool execution, the sandbox, the hosting. You wrote the while-loop that calls the model, parses the tool call, runs it, and feeds the result back. It worked, and it ran on your machine, which means it stopped the moment your laptop slept.

Managed Agents move that whole harness onto Anthropic's infrastructure. You define an agent (a model, a system prompt, a set of tools), point it at a sandbox, and hand it a task. Claude runs the loop, executes the tools in a secure sandbox, and streams the results back. You never write the loop. You never stand up a server.

Here's the part to get straight before you build one. A managed agent is not "the same agent, but easier to host." The convenience is real, but it's not the point. The point is the work that runs when you're not there. A session can run for minutes or hours, it survives pauses, and with a scheduled deployment it kicks itself off on a cron without anyone hitting a button. That is the thing a local agent loop cannot do, and it's the only reason to reach for this.

The short version: run it managed when the work needs to outlive your laptop session. Keep the loop local when you need to control every step yourself.

The Blueprint

Three resources get you a self-running agent: an agent, an environment, and a scheduled deployment. The ant CLI ships all of them. Copy, paste, customize.

Every Managed Agents request needs the managed-agents-2026-04-01 beta header. The CLI and the SDKs set it for you, so you won't see it below, but that's the flag that gates the whole API.

Step 0: install the CLI and set your key. You need an Anthropic Console account and an API key.

# macOS
brew install anthropics/ant/ant
# Linux or WSL: see the install docs for the curl one-liner

export ANTHROPIC_API_KEY="sk-ant-..."
ant --version

Step 1: create the agent. The agent is a reusable, versioned config: model, system prompt, tools. The agent_toolset_20260401 tool type turns on the full pre-built set (bash, file read and write, web search and fetch). Capture the returned id, you'll reference it everywhere.

AGENT_ID=$(ant beta:agents create \
  --name "Weekly dependency auditor" \
  --model '{id: claude-opus-4-8}' \
  --system "You audit a project's dependencies for known CVEs and outdated major versions. Write findings to a dated markdown report. Be terse and specific." \
  --tool '{type: agent_toolset_20260401}' | jq -er '.id')

Step 2: create the environment. The environment is the sandbox the agent runs in. This one is a cloud sandbox with open networking so it can reach a package registry. For real work you lock the networking down, more on that in The Anvil.

ENVIRONMENT_ID=$(ant beta:environments create \
  --name "auditor-env" \
  --config '{type: cloud, networking: {type: unrestricted}}' | jq -er '.id')

Step 3: put it on a schedule. This is the centerpiece. A scheduled deployment binds the agent, the environment, and an opening message to a cron expression. Claude starts a fresh session on that cadence and does the work. The deployment takes a YAML body, which is the cleanest way to see the whole thing at once:

DEPLOYMENT_ID=$(ant beta:deployments create <<YAML | jq -er '.id'
name: Friday dependency audit
agent: $AGENT_ID
environment_id: $ENVIRONMENT_ID
initial_events:
  - type: user.message
    content:
      - type: text
        text: Clone the repo, audit every dependency for known CVEs and stale majors, and write a dated report to audit-report.md.
schedule:
  type: cron
  expression: "0 17 * * 5"
  timezone: America/Denver
YAML
)

The two fields that matter are expression and timezone. The cron above is standard POSIX (minute hour day-of-month month day-of-week), so 0 17 * * 5 is every Friday at 5 PM. The timezone is an IANA identifier, and the schedule matches literal wall-clock time in that zone. The response echoes schedule.upcoming_runs_at with the next fire times, which is your confirmation the cron parsed the way you meant.

Step 4: don't wait until Friday to find out it works. Trigger it now with a manual run, then read the run log. The manual run starts a real session immediately and records a run with trigger_context.type: "manual".

ant beta:deployments run --deployment-id "$DEPLOYMENT_ID"

# then watch the runs, and filter to just the failures
ant beta:deployment-runs list --deployment-id "$DEPLOYMENT_ID"
ant beta:deployment-runs list --deployment-id "$DEPLOYMENT_ID" --has-error

That's a complete agent that audits your dependencies every Friday afternoon and leaves a report waiting for you, running on infrastructure you don't own and didn't configure. Swap the system prompt and the opening message and it's a weekly competitor scan, a Monday inbox digest, or a nightly data-quality check instead.

The Anvil

Now the part the launch demos skip: where managed agents bite, and how to stop the bleeding.

Stateful by design means no Zero Data Retention and no HIPAA BAA. This is the one to read twice. Managed Agents are stateful on purpose: sessions persist conversation history, sandbox filesystem state, and outputs server-side so they can resume cleanly. The cost of that is the feature is not eligible for Zero Data Retention or a HIPAA Business Associate Agreement right now. If you're pointing one of these at a client's regulated data, stop and check that first. You can delete sessions and uploaded files through the API at any time, but deletion you have to remember is not the same as data that was never retained. Plan the cleanup, or keep regulated workloads on a path that supports the agreement you need.

Cron matches wall-clock time, and daylight saving will surprise you. The schedule fires on literal local time, which is usually what you want and occasionally a trap. On a spring-forward day, a wall-clock time that doesn't exist (like 2 AM) never fires. On a fall-back day, a time that happens twice fires twice. If a missed run or a double run would actually hurt, schedule outside the 1 to 3 AM window or set the timezone to UTC and do the math yourself.

You gave up the loop, and that's the trade, not a bug. The reason this is so little code is that Anthropic runs the agent loop. You don't get to intercept every tool call, rewrite a result mid-flight, or branch the control flow on your own logic. If your agent needs that kind of step-by-step control, you don't want Managed Agents, you want the Messages API and your own loop. Pick the managed harness for autonomy, not for fine-grained steering.

A green schedule is not a green run. A deployment can sit there looking healthy while every run quietly fails. If the environment gets archived, or session creation hits a rate limit, the run is recorded as an error and the schedule just tries again next time. Worse, if the agent itself is archived the whole deployment auto-archives, and if a subagent it depends on is archived the deployment auto-pauses. Don't trust the schedule, watch deployment-runs list --has-error. That log is the only place the silent failures show up.

The rule of thumb: managed agents are for work that should keep running when you're not looking. If you need to watch every step, this is the wrong tool, and that's fine. Match the harness to the job: autonomy and a schedule here, fine-grained control on the Messages API.

Sparks

A few more things worth your attention this week:

Scheduled deployments cap at 1,000 per org and apply up to 10 seconds of jitter to spread load. That's plenty for cron-style jobs, but it means this is not your real-time trigger. For event-driven work, wire up webhooks instead of a tight schedule.
Multi-agent orchestration is in here too. An agent's callable_agents field lets one managed agent invoke another, so a planner can hand work to specialists. It's a research preview behind an access request right now. We may build a multi-agent version of today's setup in a future issue.
The cloud sandbox isn't your only option. Managed Agents also run on Claude Platform on AWS and against self-hosted sandboxes on your own infrastructure, which is the path when data residency or compliance rules out the managed cloud.

The Smith's Take

For a long time, "I built an agent" meant "I have a script running in a terminal I have to keep open." The agent and your attention were the same resource. The moment you closed the laptop, the agent was gone. Managed Agents break that link, and the builders who get value from them aren't the ones chasing the easier setup. They're the ones who looked at a recurring task and asked one question: does this need me in the room, or does it just need to happen?

That's the whole call. Setup convenience is a nice side effect. The real shift is that the work now runs on a clock you set and infrastructure you don't babysit. A tool you run by hand and an agent that runs itself on Friday at 5 are answers to different questions, and the skill is knowing which one you're holding.

Pick one task you already do on a cadence (a weekly scan, a digest, a report you build by hand every Monday) and put it behind a scheduled deployment this week. Trigger a manual run, read the deployment-runs log, and then leave it alone until its first real fire. Once you've watched an agent do the work while you weren't looking, you'll know exactly which of your standing chores have been waiting for a schedule.

Build agents that actually work.

- Michael