What the agent can do
- Hydrate project state. Before making strategic decisions, the agent reads your full project state, graph nodes, research log, project files, and active compute leases, so its responses are grounded in what’s actually in your project, not just the current chat window.
- Search and synthesize. The agent can query unified search, start Oracle jobs, run Tracer code searches, and launch Deep Research jobs, then turn the results into graph nodes, log entries, or inline summaries.
- Stage experiments. When the agent proposes an experiment, it creates a staged empirical node with a full 8-section blueprint: hypothesis, falsification criterion, method, expected outcomes, baseline, metrics, data config, and rigor review. The proposal appears as a card in the chat for you to review.
- Write the research log. The agent writes durable log entries automatically as work progresses, recording decisions, dead ends, pivots, and results without you having to prompt it.
- Inspect files and artifacts. The agent can list directories, read files, write files, and search repositories within your project volume, giving it access to code, datasets, plots, and generated reports.
- Spawn sub-agents. After you approve a compute proposal, the agent can spawn a sandboxed sub-agent to execute an experiment. The sub-agent receives a structured payload, not chat history, and reports back through the telemetry system.
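To make the blueprint structure concrete, here is a minimal sketch of the eight sections as a Python dataclass, with a helper that reports which sections are still empty. The class name, field names, and helper are hypothetical, chosen only to mirror the section list above; the actual Thesis schema may differ.

```python
from dataclasses import dataclass, fields


# Hypothetical model: field names mirror the eight blueprint sections
# named in the docs, not a confirmed Thesis schema.
@dataclass
class Blueprint:
    hypothesis: str
    falsification_criterion: str
    method: str
    expected_outcomes: str
    baseline: str
    metrics: str
    data_config: str
    rigor_review: str


def missing_sections(blueprint: Blueprint) -> list[str]:
    """Return the names of sections that are empty or whitespace-only."""
    return [
        f.name
        for f in fields(blueprint)
        if not getattr(blueprint, f.name).strip()
    ]
```

A review step could call `missing_sections` on a staged proposal and refuse to treat it as complete until the returned list is empty.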
Important boundaries
The agent is intentionally constrained in specific ways:
- Spending compute (starting a GPU lease or spawning a sub-agent) always requires explicit approval from you. The agent cannot acquire compute on its own, regardless of what it’s been instructed to do.
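The approval rule can be pictured as a gate that every compute action must pass through. The sketch below is illustrative only: `ComputeProposal`, `ApprovalRequired`, and `execute_compute` are hypothetical names, and the real enforcement happens inside the Thesis backend rather than in code you write.

```python
from dataclasses import dataclass


class ApprovalRequired(Exception):
    """Raised when a compute action is attempted without user approval."""


# Hypothetical proposal record; "action" would be something like
# "gpu_lease" or "spawn_sub_agent".
@dataclass
class ComputeProposal:
    node_id: str
    action: str
    approved_by_user: bool = False


def execute_compute(proposal: ComputeProposal) -> str:
    """Run a compute action only if the user has explicitly approved it."""
    if not proposal.approved_by_user:
        # No instruction to the agent can bypass this check.
        raise ApprovalRequired(
            f"{proposal.action} for node {proposal.node_id} needs approval"
        )
    return f"{proposal.action} started for node {proposal.node_id}"
```

The point of the design is that approval is a property of the proposal set by you, not something the agent can assert about itself.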
The ReAct loop from your perspective
When you send a message in the agent chat, the following happens:
Your message is sent to the agent
The chat UI sends your message to the Thesis backend, which starts a server-sent event (SSE) stream back to your browser.
The agent thinks and calls tools
The agent reasons about your message, then calls tools: searching the knowledge base, reading graph nodes, querying the log, or staging a new node. You see text tokens streaming in real time, and tool calls appear as labeled events in the chat.
Tool results update project state
Each tool call returns a result. If the tool mutates state (for example, by creating a node, writing a log entry, or starting an Oracle job), the change is durable immediately. The Thesis graph and log are updated in the backend, not held in memory.
The agent continues until done
The agent may run multiple rounds of thinking and tool calls before finishing. Each round re-evaluates whether the task is complete. The loop ends when the agent decides it has finished, or after a maximum of 10 rounds.
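The round limit described above can be sketched as a bounded loop. This is a simplified illustration, not the Thesis implementation: `run_loop` and the `step` callback are hypothetical, and `step` stands in for one round of thinking plus tool calls, returning `None` when the agent decides it is done.

```python
from typing import Callable, Optional

MAX_ROUNDS = 10  # the cap stated in the docs


def run_loop(
    step: Callable[[list[str]], Optional[str]],
    max_rounds: int = MAX_ROUNDS,
) -> tuple[list[str], str]:
    """Run rounds until the step signals completion or the cap is reached.

    `step` receives the observations gathered so far and returns the next
    observation, or None to signal that the task is complete.
    """
    history: list[str] = []
    for _ in range(max_rounds):
        observation = step(history)
        if observation is None:
            return history, "finished"
        history.append(observation)
    # The cap was hit before the step signalled completion.
    return history, "round_limit"
```

Returning the stop reason alongside the history lets a caller distinguish a task the agent considered complete from one that was cut off by the cap.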
What sub-agents receive
When the agent spawns a sub-agent after approval, the sub-agent does not receive your chat history. Instead, it receives a structured payload compiled from durable project state:

| Field | Contents |
|---|---|
| THESIS_BLUEPRINT_JSON | The full 8-section experiment blueprint |
| THESIS_NODE_ID | The ID of the experiment node to execute |
| THESIS_PROJECT_ID | The project scope |
| THESIS_CALLBACK_URL | Endpoint for telemetry, signals, and asset uploads |
| THESIS_RUNNER_TOKEN | Auth token for the callback URL |
Sub-agents do not inherit changes made to the project after they’re spawned. If your plan changes materially mid-run, you can send an explicit signal to the running agent or spawn a new run after the current one completes.
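As an illustration of how a sub-agent might consume this payload, the sketch below reads the five fields from a mapping and fails fast if any are missing. It assumes the fields are delivered as environment variables, which the field names suggest but the documentation does not confirm; `RunnerPayload` and `load_payload` are hypothetical names.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping

# Field names taken from the payload table; delivery via environment
# variables is an assumption, not documented behavior.
REQUIRED_FIELDS = (
    "THESIS_BLUEPRINT_JSON",
    "THESIS_NODE_ID",
    "THESIS_PROJECT_ID",
    "THESIS_CALLBACK_URL",
    "THESIS_RUNNER_TOKEN",
)


@dataclass(frozen=True)
class RunnerPayload:
    blueprint: dict
    node_id: str
    project_id: str
    callback_url: str
    runner_token: str


def load_payload(env: Mapping[str, str] = os.environ) -> RunnerPayload:
    """Build the payload from a mapping, raising if any field is absent."""
    missing = [name for name in REQUIRED_FIELDS if not env.get(name)]
    if missing:
        raise KeyError(f"missing payload fields: {', '.join(missing)}")
    return RunnerPayload(
        blueprint=json.loads(env["THESIS_BLUEPRINT_JSON"]),
        node_id=env["THESIS_NODE_ID"],
        project_id=env["THESIS_PROJECT_ID"],
        callback_url=env["THESIS_CALLBACK_URL"],
        runner_token=env["THESIS_RUNNER_TOKEN"],
    )
```

Because the payload is frozen once loaded, it reflects the same property described above: the sub-agent works from a snapshot and does not see later project changes.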