Run a sandboxed sub-agent on an approved experiment

After you approve an experiment proposal, Thesis delegates execution to a sandboxed sub-agent that runs independently of your chat session. The sub-agent receives a compiled payload from durable Thesis state, not your chat history, and reports telemetry back as it works. This guide covers every step from approval to result.

The approval flow

Approve the proposal in the UI

On the experiment proposal card in the agent chat (or in the Agents tab), choose a GPU SKU and provider, then click Run. This triggers two things in sequence:

The backend writes compute_approval.json to the node and logs a proposal_approved entry in the research log.
The frontend injects a synthesized hidden message into the chat stream, passing the approval token to the agent.

No compute is acquired yet; the token is simply minted and ready to be consumed.

If the agent needs to request approval programmatically rather than via the UI card, it calls thesis_request_compute_grant_approval to initiate the approval session and prompt you for confirmation.

The agent calls spawn_experiment_agent

The main agent receives the approval token in the injected message and emits a spawn_experiment_agent tool call. The backend validates the token, then executes this sequence:

Creates an approval session and compute grant tied to the token.
Calls thesis_compute_acquire to lease a GPU from the selected provider.
Mints a runner token scoped to the sub-agent’s callback URL.
Updates the node status to in_progress.
Starts the sandbox runner on the provider when supported.

The agent acknowledges the result in chat, and the Agents tab refreshes to show the new active lease.

The sub-agent executes

The sub-agent runs in an isolated sandbox. It receives the following compiled payload, sourced entirely from durable Thesis state, not from chat history:

Variable	What it contains
`THESIS_BLUEPRINT_JSON`	The full 8-section experiment blueprint from `blueprint.json`
`THESIS_NODE_ID`	The ID of the empirical node being executed
`THESIS_PROJECT_ID`	The project scope for graph and filesystem access
`THESIS_CALLBACK_URL`	Endpoint for telemetry, signals, and asset uploads
`THESIS_RUNNER_TOKEN`	Auth token for sub-agent callbacks (not your API key)

The sub-agent does not receive your chat history. If the research plan changes materially while the sub-agent is running, spawn a new run or send an explicit signal via send_to_agent.
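As a rough illustration of how a runner might read this payload, the sketch below assumes the five variables are exposed as environment variables and that THESIS_BLUEPRINT_JSON holds the blueprint inline as a JSON string. Both are assumptions, and RunnerPayload and load_payload are hypothetical names, not part of Thesis.

```python
import json
import os
from dataclasses import dataclass

# Variable names come from the payload table; everything else is illustrative.
REQUIRED_VARS = (
    "THESIS_BLUEPRINT_JSON",
    "THESIS_NODE_ID",
    "THESIS_PROJECT_ID",
    "THESIS_CALLBACK_URL",
    "THESIS_RUNNER_TOKEN",
)


@dataclass(frozen=True)
class RunnerPayload:
    """Hypothetical container for the compiled payload (not a Thesis type)."""
    blueprint: dict
    node_id: str
    project_id: str
    callback_url: str
    runner_token: str


def load_payload(env=None) -> RunnerPayload:
    """Read the payload from the environment, failing fast if anything is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing payload variables: " + ", ".join(missing))
    return RunnerPayload(
        # Assumption: the blueprint is passed inline as JSON text, not as a file path.
        blueprint=json.loads(env["THESIS_BLUEPRINT_JSON"]),
        node_id=env["THESIS_NODE_ID"],
        project_id=env["THESIS_PROJECT_ID"],
        callback_url=env["THESIS_CALLBACK_URL"],
        runner_token=env["THESIS_RUNNER_TOKEN"],
    )
```

Failing fast on a missing variable keeps a misconfigured run from starting work it cannot report back on.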

Because the sub-agent works from the blueprint, the quality of the experiment design directly determines the quality of the execution. A well-specified method_section and data_config_section reduce the risk that the sub-agent has to guess at ambiguous implementation details.

Monitor telemetry in the Agents tab

Switch to the Agents tab to watch the sub-agent’s progress. The backend persists runner telemetry events in real time:

Event type	What it signals
`started`	Sub-agent initialized and running
`heartbeat`	Periodic liveness signal
`milestone_report`	A named checkpoint the sub-agent explicitly logged
`stdout` / `stderr`	Process output
`asset_manifest`	A new file or result has been generated
`error_trace`	An exception or failure
`done`	Sub-agent finished

Milestones, generated files, and results are surfaced back in the agent chat as asset_card events, so you can see plots, reports, and datasets without leaving the conversation.
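To make the event types concrete, here is a minimal sketch of how a client could derive a coarse run status from a telemetry stream. The event shape (a dict with "type" and "ts" in seconds), the 120-second stall threshold, and the status names are all assumptions for illustration; Thesis may represent and interpret these events differently.

```python
def summarize_run(events, now, stall_after=120.0):
    """Return 'pending', 'running', 'stalled', 'failed', or 'finished'.

    events: iterable of dicts like {"type": "heartbeat", "ts": 30.0}
            (hypothetical shape, not the Thesis wire format).
    now:    current time on the same clock as the event timestamps.
    """
    last_seen = None
    failed = False
    for event in events:
        kind = event["type"]
        if kind == "done":
            # Assumption: a run that logged an error_trace before finishing counts as failed.
            return "failed" if failed else "finished"
        if kind == "error_trace":
            failed = True
        # Any event, not only heartbeat, is treated as a sign of life.
        ts = event["ts"]
        last_seen = ts if last_seen is None else max(last_seen, ts)
    if last_seen is None:
        return "pending"
    if failed:
        return "failed"
    if now - last_seen > stall_after:
        return "stalled"
    return "running"
```

A "stalled" result from a check like this is the kind of situation where releasing the lease manually becomes relevant.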

Release compute when done

When the sub-agent finishes, the lease is released automatically. If you need to stop a run early or clean up a stalled lease, use:

{
  "name": "thesis_compute_release",
  "arguments": {
    "lease_id": "<lease-id>"
  }
}

To release all active leases at once:

{
  "name": "thesis_compute_release_all",
  "arguments": {}
}

Released leases appear in the Completed section of the Agents tab.
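If you build these tool calls from a script, a small helper can keep the two payload shapes consistent. This is only a sketch: release_call is a hypothetical helper and "lease-123" is a placeholder ID. How the payload is then sent to the agent is outside what this guide specifies.

```python
import json


def release_call(lease_id=None):
    """Build the tool-call payload for releasing one lease, or all leases if no ID is given."""
    if lease_id is None:
        return {"name": "thesis_compute_release_all", "arguments": {}}
    if not lease_id.strip():
        raise ValueError("lease_id must be a non-empty string")
    return {"name": "thesis_compute_release", "arguments": {"lease_id": lease_id}}


# "lease-123" is a placeholder, not a real lease ID.
print(json.dumps(release_call("lease-123"), indent=2))
```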

Compute providers

Thesis supports two compute backends:

Provider	Best for	Notes
Modal	Sandboxed sub-agent runs	Default runner; starts automatically on approval. Connect in Settings → API Keys.
Lambda Cloud	GPU instances for longer training runs	Set `provider: "LAMBDA"` in the proposal. Connect your Lambda Cloud account in Settings → API Keys.

Browse available GPU SKUs with thesis_compute_list_options before approving a proposal if you want to select a specific instance type.

Safety boundaries

Thesis is intentionally conservative with compute:

The approval token is short-lived and single-use. Because the backend validates the token before acquiring compute, a prompt that merely instructs the agent to start a run cannot do so.
Sub-agent callbacks use runner tokens, not your API key, so a compromised sub-agent cannot access your account.
The main agent cannot spend compute on its own; every run requires an explicit approval action from you.
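The short-lived, single-use property can be pictured with a small in-memory model. This is a sketch of the general technique only, not the Thesis backend: the ApprovalTokens class, the 300-second lifetime, and the storage approach are all assumptions.

```python
import secrets
import time


class ApprovalTokens:
    """Illustrative in-memory token store (hypothetical, not Thesis code)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds      # assumed lifetime; the real value is not documented here
        self._clock = clock
        self._expiry = {}            # token -> expiry time

    def mint(self):
        """Create a new unguessable token that expires after the TTL."""
        token = secrets.token_urlsafe(32)
        self._expiry[token] = self._clock() + self._ttl
        return token

    def consume(self, token):
        """Return True at most once per token, and only before it expires."""
        # pop() removes the token on first use, so a replay always fails.
        expires_at = self._expiry.pop(token, None)
        return expires_at is not None and self._clock() <= expires_at
```

In this model a second consume call with the same token returns False, as does any call made after the lifetime has passed, which is why a replayed or stale token cannot start a run.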
