In this guide you’ll scaffold a project, start the Polos server, and run a coding agent that can create files, execute shell commands, and search a codebase - all with built-in sandbox restrictions and interactive approval in the terminal.
In a second terminal, start an interactive session with the coding agent:
polos run coding_agent
Give it a task:
> Build a REST API with Express and SQLite — one endpoint: GET /orders
The agent will start working - writing files, running commands, and building the project inside a sandboxed environment. When it needs to execute a command or write a file, polos run prompts you for approval directly in the terminal:
COMMAND APPROVAL REQUIRED
The agent wants to run a command:
Command: npm init -y && npm install express better-sqlite3
Directory: /workspace
Approve this command? (y/n): y
-> Approved. Resuming...
Approve each step and the agent will create the files, run them, and report the results.
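To make the approval step concrete, here is a minimal sketch of a human-in-the-loop gate around a shell command. It is not how Polos implements approvals. The names `run_with_approval` and `terminal_prompt` are hypothetical, and the sketch uses only the Python standard library.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_with_approval(
    command: list[str],
    cwd: str,
    approve: Callable[[list[str], str], bool],
) -> Optional[subprocess.CompletedProcess]:
    """Run `command` only if `approve` returns True; otherwise return None.

    Hypothetical helper for illustration, not part of the Polos SDK.
    """
    if not approve(command, cwd):
        return None
    return subprocess.run(command, cwd=cwd, capture_output=True, text=True, check=False)


def terminal_prompt(command: list[str], cwd: str) -> bool:
    """Ask on stdin, loosely mirroring the y/n prompt shown above."""
    print("COMMAND APPROVAL REQUIRED")
    print(f"Command: {' '.join(command)}")
    print(f"Directory: {cwd}")
    return input("Approve this command? (y/n): ").strip().lower() == "y"
```

Calling `run_with_approval([sys.executable, "-c", "print('hello')"], ".", terminal_prompt)` would prompt before running anything. Passing the approver as a callback keeps the gate separate from the prompt, so the same check could be answered from a terminal, a dashboard, or a test.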
Open http://localhost:5173 to see your agent execution in the Polos dashboard. You can trace every step, see tool calls, and inspect the agent’s reasoning.
The coding agent gets six built-in sandbox tools (exec, read, write, edit, glob, grep) via a single sandboxTools() call in TypeScript, or sandbox_tools() in Python:
TypeScript
src/agents/coding-agent.ts

import { anthropic } from '@ai-sdk/anthropic';
import { defineAgent, maxSteps, sandboxTools } from '@polos/sdk';

const tools = sandboxTools({
  env: 'local',
});

export const codingAgent = defineAgent({
  id: 'coding_agent',
  model: anthropic('claude-sonnet-4-5'),
  systemPrompt: 'You are a coding agent with access to sandbox tools. Use your tools to read, write, and execute code.',
  tools,
  stopConditions: [maxSteps({ count: 30 })],
});

Python
src/agents/coding_agent.py

from polos import (
    Agent,
    max_steps,
    MaxStepsConfig,
    sandbox_tools,
    SandboxToolsConfig,
)

sandbox = sandbox_tools(
    SandboxToolsConfig(
        env="local",
    )
)

coding_agent = Agent(
    id="coding_agent",
    provider="anthropic",
    model="claude-sonnet-4-5",
    system_prompt=(
        "You are a coding agent with access to sandbox tools. "
        "Use your tools to read, write, and execute code."
    ),
    tools=[*sandbox],
    stop_conditions=[max_steps(MaxStepsConfig(count=30))],
)
Key things to notice:
env: "local" runs tools directly on the host - for container isolation in production, use "docker" instead (see Sandbox Tools)
Exec security defaults to approval-always for local mode - every shell command suspends for your approval
The agent gets six tools automatically: exec, read, write, edit, glob, grep
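For readers unfamiliar with the last two tool names, the sketch below shows what glob-style file matching and grep-style content search typically do. It illustrates the general idea only and assumes nothing about how the Polos tools are built. `glob_files` and `grep_files` are hypothetical helpers based on the Python standard library.

```python
import re
from pathlib import Path
from typing import Iterator


def glob_files(root: Path, pattern: str) -> list[Path]:
    """Return files under `root` matching a glob pattern such as '**/*.js'."""
    return sorted(p for p in root.glob(pattern) if p.is_file())


def grep_files(root: Path, pattern: str, regex: str) -> Iterator[tuple[Path, int, str]]:
    """Yield (path, line_number, line) for every line that matches `regex`."""
    compiled = re.compile(regex)
    for path in glob_files(root, pattern):
        text = path.read_text(errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if compiled.search(line):
                yield path, number, line
```

An agent would usually use a glob search to find candidate files, then a grep search to locate the exact lines to read or edit.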
The worker registers agents and workflows with the orchestrator. polos dev runs this automatically.
TypeScript
src/main.ts

import 'dotenv/config';
import { Polos } from '@polos/sdk';

// Import agents and workflows for registration
import './agents/coding-agent.js';
import './agents/assistant-agent.js';
import './workflows/text-review/agents.js';
import './workflows/text-review/workflow.js';

const polos = new Polos();
await polos.serve();

Python
src/main.py

import asyncio

from dotenv import load_dotenv
from polos import Polos

# Import agents and workflows for registration
import agents.coding_agent
import agents.assistant_agent
import workflows.text_review.agents
import workflows.text_review.workflow

load_dotenv()


async def main():
    async with Polos() as polos:
        await polos.serve()


if __name__ == "__main__":
    asyncio.run(main())
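The entry points above import agent modules only for their side effects. One common way such registration can work is a module-level registry that each definition adds itself to. The sketch below assumes that pattern purely for illustration. `REGISTRY` and `SketchAgent` are hypothetical, and this guide does not show the mechanism Polos actually uses.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical module-level registry, keyed by agent id.
REGISTRY: Dict[str, "SketchAgent"] = {}


@dataclass
class SketchAgent:
    id: str
    model: str

    def __post_init__(self) -> None:
        # Constructing the object registers it, so importing the module
        # that defines it is enough to make it discoverable.
        if self.id in REGISTRY:
            raise ValueError(f"duplicate agent id: {self.id}")
        REGISTRY[self.id] = self


# In a real project this definition would live in its own module
# (for example agents/coding_agent.py) and be pulled in by a bare import.
coding_agent = SketchAgent(id="coding_agent", model="claude-sonnet-4-5")
```

Under this assumption, forgetting an import in the entry point means the corresponding agent never registers, which is why the entry point lists every agent and workflow module explicitly.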
Once you’re running, the CLI gives you full control:
polos dev            # Start server + worker with hot reload
polos run <agent>    # Start an interactive session with an agent
polos agent list     # List available agents
polos tool list      # List available tools
polos logs <agent>   # Stream logs from agent runs
create-polos scaffolded a project with agents, a workflow, and configuration
polos dev started the orchestrator, connected your worker, and began watching for changes
polos run started an interactive session - each time the agent needed to run a command or write a file, Polos suspended execution for your approval
You approved - Polos resumed execution exactly where it left off
The agent ran code in a sandboxed environment with built-in tools
Every step was durably checkpointed - if the process crashed mid-execution, it would resume from the last completed step without re-running the LLM calls you already paid for
This is Polos in action: sandboxed execution, human-in-the-loop approval, and durable state - all without writing retry logic, state machines, or approval infrastructure yourself.
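The durable checkpointing described above can be pictured with a small sketch: save each step's result as soon as it completes, and on restart replay saved results instead of re-running the step. This is a simplified illustration under assumed names (`CheckpointedRun`, a JSON file as the store), not the Polos implementation.

```python
import json
from pathlib import Path
from typing import Any, Callable


class CheckpointedRun:
    """Persist each step's result so a restarted run skips completed steps.

    Hypothetical class for illustration; results must be JSON-serializable.
    """

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load results from a previous (possibly crashed) run, if any.
        self.results: dict[str, Any] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def step(self, step_id: str, fn: Callable[[], Any]) -> Any:
        if step_id in self.results:
            # Already completed: return the saved result without calling fn.
            return self.results[step_id]
        result = fn()
        self.results[step_id] = result
        # Checkpoint after every completed step.
        self.path.write_text(json.dumps(self.results))
        return result
```

If `fn` wraps an LLM call, a second `CheckpointedRun` pointed at the same file returns the stored response for that `step_id` rather than paying for the call again.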