Interactive Architecture Guide

AI Agent Architecture: From Reasoning to Production

Explore the four layers that power modern AI agent systems (LLM reasoning, orchestration logic, data infrastructure, and tool integration) and learn how Pixeltable eliminates the infrastructure complexity.

Interactive diagram · Code examples · 14 deep-dive resources

The Four Layers of Agent Architecture

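As a rough illustration of how the four layers fit together, here is a minimal plain-Python sketch. It does not use Pixeltable or a real LLM: `reason`, `TOOLS`, `Store`, and `run_agent` are hypothetical names invented for this example, and the "reasoning" step is a keyword check standing in for a model call.

```python
from dataclasses import dataclass, field
from typing import Callable

# Tool integration layer: a registry of plain callables (hypothetical tools).
TOOLS: dict[str, Callable[..., object]] = {
    "get_weather": lambda city: {"city": city, "temp": 72, "conditions": "sunny"},
}


def reason(prompt: str) -> dict:
    """Reasoning layer: stand-in for an LLM that picks a tool and its arguments."""
    if "weather" in prompt.lower():
        city = prompt.rstrip("?").split(" in ")[-1]
        return {"tool": "get_weather", "args": {"city": city}}
    return {"tool": None, "args": {}}


@dataclass
class Store:
    """Data layer: an append-only log of every step the agent takes."""
    rows: list[dict] = field(default_factory=list)


def run_agent(prompt: str, store: Store) -> object:
    """Orchestration layer: reason, dispatch the chosen tool, record the step."""
    decision = reason(prompt)
    tool = TOOLS.get(decision["tool"]) if decision["tool"] else None
    output = tool(**decision["args"]) if tool else None
    store.rows.append({"prompt": prompt, "decision": decision, "tool_output": output})
    return output


store = Store()
result = run_agent("What is the weather in Seattle?", store)
```

Each layer can be swapped independently here: a real model replaces `reason`, and a database replaces `Store`, without touching the tool registry.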

Build an Agent in 40 Lines

Define tools as UDFs, wire them to an LLM, and let Pixeltable handle state, caching, and lineage automatically

agent_workflow.py

import pixeltable as pxt
from pixeltable.functions import openai

# Assumption: a `docs` table with an embedding index on its `text` column already exists
docs = pxt.get_table('docs')

# Define tools as UDFs: Pixeltable tracks everything
@pxt.udf
def get_weather(city: str) -> dict:
    """Return the current weather for a city (stubbed with fixed values)."""
    return {"temp": 72, "conditions": "sunny"}

@pxt.query
def search_knowledge(query: str):
    """Return the five document texts most similar to the query."""
    sim = docs.text.similarity(string=query)
    return docs.order_by(sim, asc=False).limit(5).select(docs.text)

# Load MCP tools from any server
mcp_tools = pxt.mcp_udfs('http://localhost:8000/mcp')
tools = pxt.tools(get_weather, search_knowledge, *mcp_tools)

# Create agent workflow: state is automatic
agent = pxt.create_table('my_agent', {'prompt': pxt.String})

# LLM reasons, selects tools, Pixeltable orchestrates
agent.add_computed_column(
    response=openai.chat_completions(
        model='gpt-4o',
        messages=[{
            'role': 'user',
            'content': agent.prompt
        }],
        tools=tools
    )
)

# Automatic tool execution with full lineage
# (invoke_tools is provided by the openai functions module)
agent.add_computed_column(
    tool_output=openai.invoke_tools(tools, agent.response)
)

# Insert a prompt: everything runs automatically
agent.insert(prompt='What is the weather in Seattle?')

# Full history, versioning, and reproducibility built in
agent.select(agent.prompt, agent.response, agent.tool_output).collect()

What Pixeltable handles: state persistence, tool execution, caching, lineage, and versioning
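To make the idea of a computed column concrete, here is a toy sketch in plain Python. It is not how Pixeltable is implemented: `MiniTable` is a hypothetical class written only for this example. It shows the general pattern of running each derived step once at insert time and storing the result, so later reads never recompute it.

```python
from typing import Any, Callable

Row = dict[str, Any]


class MiniTable:
    """Toy table whose computed columns are filled in once, when a row is inserted."""

    def __init__(self) -> None:
        self.rows: list[Row] = []
        self.computed: dict[str, Callable[[Row], Any]] = {}
        self.calls = 0  # how many times a computed function actually ran

    def _run(self, fn: Callable[[Row], Any], row: Row) -> Any:
        self.calls += 1
        return fn(row)

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        self.computed[name] = fn
        for row in self.rows:  # backfill rows that already exist
            row[name] = self._run(fn, row)

    def insert(self, **values: Any) -> None:
        row = dict(values)
        # Columns run in definition order, so later ones can read earlier ones.
        for name, fn in self.computed.items():
            row[name] = self._run(fn, row)
        self.rows.append(row)

    def collect(self) -> list[Row]:
        # Reads return stored values; nothing is recomputed here.
        return [dict(row) for row in self.rows]


table = MiniTable()
table.add_computed_column("response", lambda r: f"echo: {r['prompt']}")
table.add_computed_column("length", lambda r: len(r["response"]))
table.insert(prompt="hi")
first = table.collect()
second = table.collect()
```

In this sketch `table.calls` stays at 2 however many times `collect()` is called, which is the caching behavior in miniature; the stored intermediate columns double as a simple lineage record.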

Multimodal Agent Capabilities

Build agents that seamlessly process video, images, audio, and documents with unified infrastructure

Video & Image Agents

Build agents that understand visual content with automatic frame extraction, object detection (YOLOX), and visual similarity search (CLIP).

Related resources: Object detection with YOLOX, Video agent course, Building multimodal apps
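Frame extraction generally comes down to choosing timestamps at a target sampling rate. The sketch below shows only that arithmetic in plain Python; `frame_timestamps` is a hypothetical helper for illustration, not a Pixeltable function, and it does not decode any video.

```python
def frame_timestamps(duration_s: float, fps: float) -> list[float]:
    """Timestamps (in seconds) of frames sampled at `fps` over a clip of `duration_s`."""
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration_s must be >= 0 and fps must be > 0")
    step = 1.0 / fps
    count = int(duration_s * fps)  # number of whole sampling intervals in the clip
    return [round(i * step, 3) for i in range(count)]


# A 3-second clip sampled at 2 frames per second.
stamps = frame_timestamps(duration_s=3.0, fps=2.0)
```

A lower `fps` keeps the number of frames sent to a detector or embedding model small, which is usually the main cost lever for video agents.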

Document & RAG Agents

Create agents that process PDFs, transcribe audio with Whisper, and answer questions with production-grade RAG, all with automatic embedding sync.

Related resources: Production RAG guide, Whisper transcription, Document processing
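The retrieval step behind RAG can be sketched without any infrastructure. The example below uses word counts as a toy "embedding" and cosine similarity for ranking; a real system would use a learned embedding model and an index. The helper names `embed`, `cosine`, and `top_k` are assumptions made for this illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "whisper transcribes audio into text",
    "pdf pages are split into chunks",
    "embeddings are updated when new documents arrive",
]
hits = top_k("how is audio transcribed", chunks, k=1)
```

The retrieved chunks would then be placed into the LLM prompt as context. Keeping embeddings in sync means re-running `embed` whenever a chunk is added or changed, which is the part the page says Pixeltable automates.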

Multi-Agent Workflows

Coordinate specialized agents using Pixelagent's agent-as-tool pattern. Shared infrastructure means shared state, context, and history across agents.

Related resources: Agent collaboration, Context graphs & decision traces, Pixelagent launch
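The agent-as-tool pattern can be illustrated in a few lines of plain Python. This is a sketch of the general idea only, not Pixelagent's API: the `Agent` class, its `as_tool` method, and the lambda handlers are hypothetical stand-ins for LLM-backed agents.

```python
from typing import Callable

History = list[tuple[str, str, str]]  # (agent name, task, result)


class Agent:
    """Toy agent: a name, a handler function, and a history log shared with others."""

    def __init__(self, name: str, handler: Callable[[str], str], history: History) -> None:
        self.name = name
        self.handler = handler
        self.history = history

    def run(self, task: str) -> str:
        result = self.handler(task)
        self.history.append((self.name, task, result))
        return result

    def as_tool(self) -> Callable[[str], str]:
        """Expose this agent as an ordinary callable that another agent can invoke."""
        return self.run


history: History = []
researcher = Agent("researcher", lambda task: f"notes on {task}", history)
research_tool = researcher.as_tool()

# The writer delegates to the researcher by calling it like any other tool.
writer = Agent("writer", lambda task: f"summary of [{research_tool(task)}]", history)
answer = writer.run("agent memory")
```

Because both agents append to the same `history` list, the coordinator's trace includes the specialist's step, which mirrors the shared state and history described above.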

Deep Dive Resources

Explore guides, tutorials, and architecture deep-dives for building production AI agents

Categories: Architecture & Foundations, Memory & State, Tools & MCP, Multi-Agent & Team

Start Building AI Agents

Stop wrestling with infrastructure. Pixeltable handles state, caching, lineage, and multimodal data, so you can focus on agent logic.

pip install pixeltable

Open source · Apache 2.0
