# Coding Agent: Project Specification

> **Audience**: Junior developer onboarding to this project.
> **Stack**: Python · UV · microsandbox (MCP) · Textual (TUI) · pytest
> **Goal**: A local coding agent with a TUI that can later be served as a web UI for remote access.

---

## What We're Building

A coding agent that:

- Accepts user prompts via a terminal UI
- Uses Claude (via the Anthropic SDK) as the LLM
- Executes all file and shell operations inside a microsandbox microVM
- Exposes those operations via MCP so the tool layer is swappable
- Can later be served over HTTP for remote/web access without rewriting core logic

---

## Project File Structure

```
coding-agent/
│
├── pyproject.toml              # UV project manifest: dependencies, scripts, tool config
├── .python-version             # Pins Python version for UV
├── .env.example                # Template for required env vars (copy to .env)
├── README.md
│
├── agent/                      # Core agent logic; no UI concerns here
│   ├── __init__.py
│   ├── loop.py                 # The agentic loop: send message → get response → handle tool calls → repeat
│   ├── tools.py                # Tool definitions (schemas Claude sees) and dispatch table
│   ├── history.py              # Conversation history management
│   └── config.py               # Settings loaded from env vars (API keys, model name, safedir path)
│
├── sandbox/                    # All microsandbox interaction lives here
│   ├── __init__.py
│   ├── session.py              # Creates/destroys the sandbox session, exposes run(), holds lifecycle
│   └── mcp_client.py           # Connects to microsandbox's MCP server, wraps tool calls
│
├── tools/                      # Individual tool implementations; each calls sandbox/mcp_client.py
│   ├── __init__.py
│   ├── bash.py                 # run_bash(command) → str
│   ├── read.py                 # read_file(path) → str
│   ├── write.py                # write_file(path, content) → str
│   ├── list_dir.py             # list_dir(path) → str
│   └── search.py               # search_files(pattern) → str
│
├── ui/
│   ├── __init__.py
│   ├── tui/
│   │   ├── __init__.py
│   │   └── app.py              # Textual app: renders chat, captures input, calls agent/loop.py
│   └── web/                    # Stubbed out, implemented later
│       └── __init__.py         # Placeholder; see Web UI section below
│
├── tests/
│   ├── conftest.py             # Shared pytest fixtures (mock sandbox session, sample history, etc.)
│   ├── test_loop.py            # Unit tests for agentic loop logic
│   ├── test_tools.py           # Unit tests for each tool (mock the sandbox)
│   ├── test_history.py         # Tests for conversation history management
│   └── test_sandbox.py         # Integration tests for sandbox session (require msb server running)
│
└── scripts/
    └── start_sandbox_server.sh # Convenience: runs `msb server start --dev`
```

---

## Dependency Overview

Add these to the `dependencies` list in the `[project]` table of `pyproject.toml`:

| Package | Purpose | Docs |
|---|---|---|
| `anthropic` | Anthropic SDK: LLM calls and MCP client support | https://docs.anthropic.com |
| `microsandbox` | Python SDK for microsandbox VM sessions | https://github.com/zerocore-ai/microsandbox |
| `textual` | TUI framework for the terminal interface | https://textual.textualize.io |
| `python-dotenv` | Load `.env` file into environment | https://pypi.org/project/python-dotenv |
| `pydantic` | Settings validation and tool schema modeling | https://docs.pydantic.dev |
| `pydantic-settings` | Provides `BaseSettings`, used by `agent/config.py` to load settings from `.env` | https://docs.pydantic.dev/latest/concepts/pydantic_settings/ |

Dev dependencies (added with `uv add --dev`, which recent UV versions record in the `dev` group under `[dependency-groups]`):

| Package | Purpose |
|---|---|
| `pytest` | Test runner |
| `pytest-asyncio` | Async test support (needed because most code is async) |
| `pytest-mock` | Mocking sandbox calls in unit tests |

### UV Quickstart

```bash
# Install UV if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create project
uv init coding-agent
cd coding-agent

# Add dependencies
uv add anthropic microsandbox textual python-dotenv pydantic pydantic-settings
uv add --dev pytest pytest-asyncio pytest-mock

# Run the TUI
uv run python -m ui.tui.app

# Run tests
uv run pytest
```

---

## Architecture: Concerns and Boundaries

The most important rule: **each layer only talks to the layer directly below it.**

```
┌─────────────────────────────────────┐
│  UI Layer (ui/)                     │  Renders output, captures input.
│  Textual TUI | Web (later)          │  No LLM calls. No sandbox calls.
└──────────────┬──────────────────────┘
               │ calls
┌──────────────▼──────────────────────┐
│  Agent Layer (agent/)               │  Owns the loop. Talks to Anthropic API.
│  loop.py · tools.py · history.py    │  Decides which tools to call.
└──────────────┬──────────────────────┘
               │ calls
┌──────────────▼──────────────────────┐
│  Tools Layer (tools/)               │  One file per tool. Pure functions.
│  bash · read · write · list · grep  │  No LLM knowledge. No UI knowledge.
└──────────────┬──────────────────────┘
               │ calls
┌──────────────▼──────────────────────┐
│  Sandbox Layer (sandbox/)           │  Owns the VM session and MCP connection.
│  session.py · mcp_client.py         │  Everything executes in here.
└─────────────────────────────────────┘
               │
      microVM (isolated)
      safedir mounted in
```

**Why this matters**: When you swap the TUI for a web UI, you only touch `ui/`. When you swap microsandbox for a different execution backend, you only touch `sandbox/`. The agent loop doesn't change.

---

## Key Implementation Notes

### 1. The Agentic Loop (`agent/loop.py`)

This is the heart of the project. The pattern is:

1. Add user message to history
2. Send full history to Claude
3. If response contains tool calls → execute them → add results to history → go to 2
4. If response is plain text → return it to the UI

```python
# Rough shape of loop.py
# call_claude() and execute_tools() are helpers you write in this module:
# call_claude() wraps the Anthropic messages call, execute_tools() runs each
# tool_use block through agent/tools.py and returns a list of tool_result blocks.
async def run_turn(user_message: str, history: list, sandbox) -> str:
    history.append({"role": "user", "content": user_message})

    while True:
        response = await call_claude(history)
        # Record every assistant message, including the final one,
        # so the next turn sees the full conversation
        history.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "tool_use":
            tool_calls = [b for b in response.content if b.type == "tool_use"]
            tool_results = await execute_tools(tool_calls, sandbox)
            history.append({"role": "user", "content": tool_results})
            continue  # send the tool results back to Claude

        # Any other stop reason (end_turn, max_tokens, ...): return the text
        # instead of looping forever
        return "".join(b.text for b in response.content if b.type == "text")
```

Reference: https://docs.anthropic.com/en/docs/build-with-claude/tool-use

### 2. Tool Definitions (`agent/tools.py`)

Tool support needs two things: a JSON schema describing each tool (this is what Claude sees), and a dispatch function that routes Claude's tool calls to the right implementation.

```python
# tools.py exports two things:

TOOL_SCHEMAS = [
    {
        "name": "bash",
        "description": "Run a shell command in the sandbox",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The shell command to run"}
            },
            "required": ["command"]
        }
    },
    # ... one entry per tool
]


async def dispatch(tool_name: str, tool_input: dict, sandbox) -> str:
    # routes to tools/bash.py, tools/read.py etc.
    ...
```

### 3. Sandbox Session (`sandbox/session.py`)

The sandbox session should be created once at agent startup and reused for the entire conversation. This preserves state between tool calls (installed packages, created files, env vars).

```python
# sandbox/session.py
from microsandbox import PythonSandbox


class SandboxSession:
    async def __aenter__(self):
        # Start one sandbox and keep it for the whole conversation
        self._sb = await PythonSandbox.create(name="coding-agent")
        return self

    async def run(self, command: str) -> str:
        # Named "execution" to avoid shadowing the built-in exec()
        execution = await self._sb.run(command)
        return await execution.output()

    async def __aexit__(self, *args):
        # Always stop the VM, even if the conversation ended with an error
        await self._sb.stop()
```

The examples in the SDK README use `PythonSandbox.run()` to execute Python code, so check the README for the exact lifecycle calls and for the call that runs shell commands before wiring up `tools/bash.py`.

Reference: https://github.com/zerocore-ai/microsandbox/blob/main/sdk/README.md

### 4. MCP vs Direct SDK

microsandbox supports two integration patterns:

- **Direct SDK** (`PythonSandbox.create()`): simpler, Python-native, recommended to start with
- **MCP server**: microsandbox exposes an MCP server; the Anthropic SDK can connect to it directly, and tool definitions come from the server automatically

Start with the direct SDK (`sandbox/session.py`). The `sandbox/mcp_client.py` file is stubbed for later, when you want to switch to the MCP path. The MCP approach reduces boilerplate but adds a moving part.

MCP reference: https://github.com/zerocore-ai/microsandbox/blob/main/MCP.md
Anthropic MCP docs: https://docs.anthropic.com/en/docs/build-with-claude/mcp

### 5. The TUI (`ui/tui/app.py`)

Use **Textual** for the TUI. It's async-native, which fits well because the agent loop is async.

A minimal Textual app has:

- A `RichLog` or `Markdown` widget for displaying conversation
- An `Input` widget for capturing user messages
- An `on_input_submitted` handler that calls `agent/loop.py` and appends the result

Reference: https://textual.textualize.io/guide/

### 6. Configuration (`agent/config.py`)

Use `pydantic-settings` to load from `.env`. It is a separate package from `pydantic`, so add it with `uv add pydantic-settings`.

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # pydantic-settings v2 style; replaces the older nested "class Config"
    model_config = SettingsConfigDict(env_file=".env")

    anthropic_api_key: str
    model: str = "claude-sonnet-4-5-20250929"
    safedir: str = "./workspace"
    max_tokens: int = 8096
```

---

## Environment Variables

Copy `.env.example` to `.env` and fill in:

```
ANTHROPIC_API_KEY=sk-ant-...
MODEL=claude-sonnet-4-5-20250929
SAFEDIR=./workspace
```

---

## Testing Strategy

**Unit tests** (no sandbox required; mock everything):

- `test_loop.py`: mock Claude responses, verify tool calls are dispatched correctly
- `test_tools.py`: mock `SandboxSession.run()`, verify each tool formats input/output correctly
- `test_history.py`: verify history trimming, message formatting

**Integration tests** (require `msb server start --dev`):

- `test_sandbox.py`: actually runs commands in a VM, verifies output
- Mark these with `@pytest.mark.integration` and skip by default:

```python
# conftest.py
import pytest


def pytest_addoption(parser):
    parser.addoption("--integration", action="store_true", help="run integration tests")


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark
    config.addinivalue_line("markers", "integration: requires a running msb server")


def pytest_collection_modifyitems(config, items):
    if not config.getoption("--integration"):
        skip = pytest.mark.skip(reason="pass --integration to run")
        for item in items:
            if "integration" in item.keywords:
                item.add_marker(skip)
```

Run integration tests: `uv run pytest --integration`

---

## Web UI: Future Path (No Node Required Yet)

When ready to add a web UI, the approach that avoids Node:

1. Add **FastAPI** + **uvicorn** to dependencies
2. Create `ui/web/app.py`: a FastAPI app with a `/chat` endpoint that calls `agent/loop.py`
3. Use **Server-Sent Events (SSE)** for streaming responses
4. Serve a minimal HTML/CSS frontend as a static file from FastAPI

The agent layer doesn't change at all. You're just adding a second entry point alongside the TUI.

When the project is mature enough to warrant a proper frontend, that's the point to introduce a JS framework. Until then, FastAPI + plain HTML gets you remote access without the Node toolchain.

---

## Prerequisites Before Writing Code

1. Install microsandbox: `curl -sSL https://get.microsandbox.dev | sh`
2. Start the server: `msb server start --dev`
3. Pull the Python image: `msb pull microsandbox/python`
4. Set your `ANTHROPIC_API_KEY` in `.env`

---

## Suggested Build Order

1. `agent/config.py`: settings first, everything imports this
2. `sandbox/session.py`: get a VM running and verify you can execute commands
3. `tools/bash.py` + `tools/read.py`: minimal tool set to prove the loop works
4. `agent/tools.py`: schemas and dispatch for those two tools
5. `agent/history.py`: simple list wrapper to start
6. `agent/loop.py`: wire it all together, test in a plain Python script first
7. `ui/tui/app.py`: put a Textual face on the working loop
8. Remaining tools (`write`, `list_dir`, `search`)
9. Tests throughout: write them alongside each module, not at the end
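
For step 5 of the build order, the "simple list wrapper" in `agent/history.py` could start out roughly like the sketch below. The class name `History`, its methods, and the `max_messages` limit are all hypothetical; the spec only says that history is a list wrapper and that `test_history.py` covers trimming. The sketch trims only when a new user prompt arrives, so a tool-result message is never left without the assistant tool call that precedes it.

```python
from typing import Any


class History:
    """Hypothetical list wrapper for conversation messages (illustrative only)."""

    def __init__(self, max_messages: int = 40) -> None:
        # max_messages is an assumed trimming threshold, not a value from the spec
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self.max_messages = max_messages
        self._messages: list[dict[str, Any]] = []

    def start_turn(self, text: str) -> None:
        """Record a new plain-text user prompt, then trim old turns."""
        self._messages.append({"role": "user", "content": text})
        self._trim()

    def add(self, role: str, content: Any) -> None:
        """Record an assistant reply or a tool-result message mid-turn."""
        self._messages.append({"role": role, "content": content})

    def messages(self) -> list[dict[str, Any]]:
        """Return a copy that is safe to pass to the API call."""
        return list(self._messages)

    @staticmethod
    def _is_prompt(message: dict[str, Any]) -> bool:
        # A plain-text user message marks the start of a turn
        return message["role"] == "user" and isinstance(message["content"], str)

    def _trim(self) -> None:
        overflow = len(self._messages) - self.max_messages
        if overflow <= 0:
            return
        # Drop the oldest messages, then keep dropping until the history
        # starts at a turn boundary. The newest message is always a prompt,
        # so this loop stops before the list is empty.
        del self._messages[:overflow]
        while not self._is_prompt(self._messages[0]):
            del self._messages[0]
```

The loop would call `start_turn()` once per user prompt and `add()` for assistant and tool-result messages. Counting messages is the simplest policy to test; a token-based limit could replace it later without changing the interface.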