
System Architecture

PiCrust is built with a clear separation of concerns and is designed specifically for applications with a frontend-backend split, such as Tauri desktop apps.

Core Components

AgentRuntime

The AgentRuntime is the orchestrator. It:
  • Spawns agents as independent async tasks
  • Maintains a registry of running agents by session ID
  • Provides global permission rules shared across all agents
  • Handles agent lifecycle (spawn, get, shutdown)

Key Characteristics:
  • Thread-safe (can be shared across threads)
  • A single runtime can manage hundreds of agents
  • Each agent runs independently

let runtime = AgentRuntime::new();

// Spawn an agent
let handle = runtime.spawn(session, |internals| agent.run(internals)).await;

// Get existing agent
let handle = runtime.get("session-id").await;

// List running agents
let running = runtime.list_running().await;

AgentHandle

The AgentHandle is your interface to a running agent. It:
  • Sends input to the agent
  • Subscribes to output streams
  • Sends permission responses
  • Checks agent state
  • Can interrupt or shutdown the agent

Key Characteristics:
  • Cloneable (can be shared across threads/tasks)
  • Non-blocking operations
  • Channel-based communication

// Send input
handle.send_input("Hello").await?;

// Subscribe to output
let mut rx = handle.subscribe();

// Check state
let state = handle.state().await;

// Control agent
handle.interrupt().await;
handle.shutdown().await;

StandardAgent

The StandardAgent is the brains. It:
  • Implements the main agent loop
  • Manages conversation with the LLM
  • Executes tools with permissions
  • Streams output to subscribers
  • Persists conversation history

Key Characteristics:
  • Stateful (maintains conversation context)
  • Async-first architecture
  • Extensible through hooks and context injection

AgentSession

The AgentSession handles persistence. It:
  • Stores conversation history on disk
  • Maintains session metadata
  • Supports parent-child relationships (for subagents)
  • Provides conversation naming

Key Characteristics:
  • Write-through persistence (immediate disk writes)
  • Stored in ./sessions/{session_id}/
  • JSON format for portability

// Create session
let session = AgentSession::new(
    "session-123",
    "assistant",
    "My Assistant",
    "Description",
)?;

// Load existing
let session = AgentSession::load("session-123")?;

// Get history without loading full session
let history = AgentSession::get_history("session-123")?;

Communication Flow

Input Flow

Output Flow

Permission Flow

Data Flow

Dual-Channel Architecture

The SDK uses a dual-channel approach for conversation data:

1. Stream (Ephemeral)
  • Real-time output via broadcast channels
  • Fast, in-memory
  • Lost if no subscribers
  • Perfect for live UIs

2. History (Persistent)
  • Disk-based storage
  • Durable, reliable
  • Always available
  • Ground truth for conversation state

Critical Pattern: Always subscribe BEFORE sending input, or you’ll miss early output chunks!

// Correct order
let mut rx = handle.subscribe();  // Subscribe first
handle.send_input("Hello").await?;  // Then send

// Wrong order
handle.send_input("Hello").await?;
let mut rx = handle.subscribe();  // Too late!
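
With the receiver in hand, a consumer typically drains it in a loop. A minimal sketch — the chunk type and its variant names here are illustrative assumptions, not the SDK's actual API:

```rust
// Illustrative only: `OutputChunk` and its variants are assumed names.
let mut rx = handle.subscribe();               // subscribe first
handle.send_input("Summarize the log").await?; // then send

while let Ok(chunk) = rx.recv().await {
    match chunk {
        OutputChunk::Text(text) => print!("{text}"), // stream to the UI
        OutputChunk::Done => break,                  // turn complete
        _ => {}                                      // tool use, state changes, etc.
    }
}
```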

Agent Lifecycle

States

pub enum AgentState {
    Idle,                          // Waiting for input
    Processing,                    // Calling LLM
    WaitingForPermission,          // Awaiting user approval
    ExecutingTool {                // Running a tool
        tool_name: String,
        tool_use_id: String,
    },
    WaitingForSubAgent {           // Subagent spawned
        session_id: String,
    },
    WaitingForUserInput {          // AskUserQuestion pending
        request_id: String,
    },
    Done,                          // Completed
    Error { message: String },     // Failed
}
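
A frontend typically maps these states to status labels for display. A self-contained sketch (not the SDK's code) that mirrors a subset of the enum above:

```rust
// Mirrors a subset of AgentState above; `status_label` is our own helper,
// not part of the SDK.
#[allow(dead_code)]
#[derive(Debug)]
enum AgentState {
    Idle,
    Processing,
    WaitingForPermission,
    ExecutingTool { tool_name: String, tool_use_id: String },
    Done,
    Error { message: String },
}

fn status_label(state: &AgentState) -> String {
    match state {
        AgentState::Idle => "Ready".into(),
        AgentState::Processing => "Thinking".into(),
        AgentState::WaitingForPermission => "Waiting for approval".into(),
        AgentState::ExecutingTool { tool_name, .. } => format!("Running {tool_name}"),
        AgentState::Done => "Finished".into(),
        AgentState::Error { message } => format!("Error: {message}"),
    }
}

fn main() {
    let state = AgentState::ExecutingTool {
        tool_name: "ReadFile".into(),
        tool_use_id: "tu_1".into(),
    };
    assert_eq!(status_label(&state), "Running ReadFile");
}
```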

State Transitions

Threading Model

Async Tasks

Each agent runs as an independent tokio::task:

// Spawning creates a new task
let handle = runtime.spawn(session, |internals| agent.run(internals)).await;

// Agent runs on tokio runtime
tokio::spawn(async move {
    agent.run(internals).await
});

Thread Safety

  • AgentRuntime: Arc wrapping for shared access
  • AgentHandle: Cloneable, uses channels for communication
  • Tools: Must be Send + Sync
  • Sessions: Write-through to disk, no shared state

Concurrent Access

Multiple threads can safely:
  • Send input to the same agent
  • Subscribe to agent output
  • Check agent state
  • Access different agents simultaneously
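
Because AgentHandle is Clone, this can be as simple as handing a clone to each task. A sketch, assuming the handle API shown earlier:

```rust
// Sketch: one clone streams output for the UI, the original polls state.
let ui_handle = handle.clone();
tokio::spawn(async move {
    let mut rx = ui_handle.subscribe();
    while let Ok(_chunk) = rx.recv().await {
        // forward the chunk to the frontend (e.g. as a Tauri event)
    }
});

let state = handle.state().await; // the original handle remains usable
```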

Memory Model

What’s In-Memory

  • Current conversation context (for LLM calls)
  • Broadcast channel buffers (recent output chunks)
  • Permission rules (session, local, global)
  • Tool registry
  • MCP connections

What’s On-Disk

  • Complete conversation history (./sessions/{id}/history.json)
  • Session metadata (./sessions/{id}/metadata.json)
  • Debug logs (if enabled, ./sessions/{id}/debugger/)
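
The layout above is plain paths, so it is easy to locate a session's files directly. A self-contained sketch (the helper name is ours, not the SDK's):

```rust
use std::path::PathBuf;

// Builds a path under the ./sessions/{session_id}/ layout described above.
// `session_file` is a hypothetical helper for illustration.
fn session_file(session_id: &str, file: &str) -> PathBuf {
    PathBuf::from("./sessions").join(session_id).join(file)
}

fn main() {
    assert_eq!(
        session_file("session-123", "history.json"),
        PathBuf::from("./sessions/session-123/history.json")
    );
}
```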

Memory Efficiency

  • Prompt caching reduces redundant LLM processing
  • Sessions are loaded on-demand
  • History is only read when needed
  • Broadcast channels have bounded buffers
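
Bounded buffers mean a slow subscriber can fall behind. If the stream is a tokio broadcast channel (as the text suggests), a lagging receiver surfaces as RecvError::Lagged, and the persisted history can be used to re-sync — a sketch:

```rust
use tokio::sync::broadcast::error::RecvError;

loop {
    match rx.recv().await {
        Ok(_chunk) => { /* render the chunk */ }
        Err(RecvError::Lagged(skipped)) => {
            // `skipped` chunks were dropped from the bounded buffer;
            // fall back to the on-disk history as ground truth.
            eprintln!("missed {skipped} chunks; re-syncing from history");
        }
        Err(RecvError::Closed) => break, // agent shut down
    }
}
```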

Extensibility Points

1. Custom Tools

Implement the Tool trait:

#[async_trait]
impl Tool for MyTool {
    fn name(&self) -> &str { "MyTool" }
    fn description(&self) -> &str { "..." }
    fn definition(&self) -> ToolDefinition { ... }
    async fn execute(&self, input: &Value, internals: &mut AgentInternals) -> Result<ToolResult> {
        // Your logic
    }
}

2. Hooks

Intercept agent behavior:

hooks.add(HookEvent::PreToolUse, |ctx| {
    // Modify or block tool execution
    HookResult::allow()
});

3. Context Injection

Add dynamic context to LLM calls:

chain.add_fn("timestamp", |internals, messages| {
    let time = chrono::Utc::now();
    inject_system_reminder(messages, &format!("Current time: {}", time));
});

4. Custom LLM Providers

Implement the LlmProvider trait:

#[async_trait]
impl LlmProvider for MyProvider {
    async fn send_message(&self, ...) -> Result<MessageResponse> {
        // Call your LLM API
    }
}

5. MCP Servers

Connect external tool providers:

let mcp = MCPClient::connect("http://localhost:8005/mcp").await?;
mcp_manager.add_service("filesystem", mcp).await?;

Design Principles

1. Frontend-First

The architecture prioritizes responsive UIs:
  • Streaming output for real-time feedback
  • Non-blocking permission requests
  • State visibility for UI updates

2. Type Safety

Rust’s type system prevents entire classes of errors:
  • Compile-time verification
  • No null pointer exceptions
  • Memory safety guarantees

3. Production-Ready

Built for real applications:
  • Session persistence (survive crashes)
  • Permission system (security)
  • Error handling (graceful degradation)
  • Debugger (troubleshooting)

4. Flexibility

Easy to customize and extend:
  • Pluggable LLM providers
  • Custom tools
  • Hook system
  • Context injection

5. Performance

Optimized for efficiency:
  • Async I/O (non-blocking)
  • Prompt caching (up to 90% lower cost on cached input)
  • Connection pooling
  • Minimal allocations

Comparison with Other Architectures

vs. Stateless Request/Response

PiCrust:
  • Maintains conversation state
  • Session persistence
  • Multi-turn interactions
Stateless:
  • No state between requests
  • Must send full history each time
  • Simpler but less capable

vs. Monolithic Agent

PiCrust:
  • Multiple independent agents
  • Subagent spawning
  • Runtime management
Monolithic:
  • Single agent instance
  • No isolation
  • Harder to scale

vs. Microservices

PiCrust:
  • Embedded in your application
  • Type-safe communication
  • No network overhead
Microservices:
  • Separate services
  • Network communication
  • More operational complexity

Next Steps

Now that you understand the architecture:

Runtime & Lifecycle

Deep dive into AgentRuntime and agent lifecycle

Sessions & Persistence

Learn about conversation persistence

Message Flow

Understand input/output communication

Agent States

Explore all agent states and transitions