Pixeltable

Iterate on Your Data, Not Your Infrastructure

Declarative · Storage · Orchestration · Retrieval · Incremental
"85% less pipeline code" (Obvio)
"Shipped in 3 days, not 3 weeks" (Variata)
From the creators of Apache Parquet and Impala and engineers who worked at:
Apple
Google
Amazon
Facebook
Airbnb
Cloudera
MapR
Dremio
Oracle
IBM
Apache 2.0
01 Acquire: Ingest video, audio, and docs from any source
02 Enrich: Auto-annotate with AI models & UDFs
03 Curate: Search, filter, snapshot, version
04 Export: Parquet, PyTorch, COCO, Pandas
Full guide
# Video → frames → detections → export
import pixeltable as pxt
from pixeltable.iterators import FrameIterator
from pixeltable.functions import yolox, openai

# 01 Acquire — create table with multimodal types
videos = pxt.create_table('ml.videos', {
    'video': pxt.Video,
    'title': pxt.String,
    'source': pxt.String,
})

# 02 Enrich — extract frames, detect objects, describe
frames = pxt.create_view('ml.frames', videos,
    iterator=FrameIterator.create(video=videos.video, fps=1)
)
frames.add_computed_column(
    detections=yolox(frames.frame, model_id='yolox_s', threshold=0.5)
)
frames.add_computed_column(
    caption=openai.vision(
        prompt='Describe this frame in one sentence.',
        image=frames.frame, model='gpt-4o-mini'
    )
)

# 03 Curate — filter and query enriched data
results = frames.where(frames.caption.like('%person%')).order_by(
    frames.pos_msec
).select(frames.frame, frames.detections, frames.caption).collect()

# 04 Export — to ML-ready formats
from pixeltable.io import export_parquet
export_parquet(frames, 'training_data/')
df = frames.select(frames.frame, frames.detections).collect().to_pandas()
ml.videos (table)
Column      Type     Computed With
video       Video
title       String
source      String

ml.frames (view)
Column      Type     Computed With
frame       Image    FrameIterator(fps=1)
detections  Json     yolox(frame)
caption     String   openai.vision(frame)

output (export)
export_parquet(frames)
frames.collect().to_pandas()
Insert a video → frames extracted → objects detected → captions generated. Automatic.
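The automation rests on a simple contract: a computed column is a function of other columns, backfilled when declared and evaluated for every new row on insert. A toy plain-Python sketch of that contract (the `Table` class here is invented for illustration; it is not how Pixeltable is implemented):

```python
# Toy model of a computed column: derived values materialize on insert.
class Table:
    def __init__(self):
        self.rows = []      # base data
        self.computed = {}  # column name -> function of a row

    def add_computed_column(self, name, fn):
        self.computed[name] = fn
        for row in self.rows:        # backfill existing rows
            row[name] = fn(row)

    def insert(self, **row):
        for name, fn in self.computed.items():
            row[name] = fn(row)      # new rows are computed eagerly
        self.rows.append(row)

t = Table()
t.add_computed_column('n_words', lambda r: len(r['caption'].split()))
t.insert(caption='a person walking a dog')
print(t.rows[0]['n_words'])  # → 5
```

The real system adds persistence, batching, and retries on top, but the insert-triggers-computation shape is the same.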

Multimodal Data, Made Simple

Video, audio, images, and documents as first-class data types — with storage, orchestration, and retrieval unified under one table interface.


Replace Complexity with One-Liners

Every capability that used to require a separate system is now a single function call.

Capability: Traditional Approach → What You Write

  • Video storage: S3 bucket + IAM + upload scripts → pxt.create_table(..., {'video': pxt.Video})
  • Frame extraction: FFmpeg scripts + output management → FrameIterator.create(video=..., fps=1)
  • Object detection: Model serving + GPU management + batch scripts → add_computed_column(detections=yolox(...))
  • Vision descriptions: API client + retry logic + rate limiting + result storage → add_computed_column(description=openai.vision(...))
  • Audio extraction: FFmpeg + temp file management → add_computed_column(audio=extract_audio(...))
  • Transcription: Whisper API client + chunking + storage → add_computed_column(transcript=transcribe(...))
  • Image search: CLIP embedding + Pinecone + sync scripts → add_embedding_index('frame', embedding=clip.using(...))
  • Text search: Another embedding pipeline + another index → add_embedding_index('description', string_embed=...)
  • Orchestration: Airflow DAG + dependency config + monitoring → Automatic — insert triggers everything
  • Versioning: Custom tracking across all services → Automatic — table.history(), table.revert()
  • Incremental updates: Custom diffing logic per service → Automatic — only new/changed rows process
See the full quickstart guide
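The "only new/changed rows process" claim is worth unpacking: when each computed value is stored alongside its input row, re-running a pipeline becomes a cache lookup for rows already seen. A minimal conceptual illustration in plain Python (a sketch of the idea, not Pixeltable code):

```python
# Toy sketch of incremental recomputation: results are cached per input,
# so re-running the pipeline only does work for new rows.
calls = []

def expensive_model(x):
    calls.append(x)          # track how much real work was done
    return x.upper()

cache = {}

def process(rows):
    out = []
    for r in rows:
        if r not in cache:   # only new/changed inputs hit the model
            cache[r] = expensive_model(r)
        out.append(cache[r])
    return out

process(['cat', 'dog'])
process(['cat', 'dog', 'bird'])  # only 'bird' is recomputed
print(calls)  # → ['cat', 'dog', 'bird']
```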

The Anatomy of a Multimodal AI App

Clone the Starter Kit and ship a full-stack multimodal AI app in minutes. Upload docs, images, and videos — search across all of them — chat with an 8-step tool-calling agent.

pixeltable-starter-kit/
  backend/
    setup_pixeltable.py   (5 Pipelines + Agent Workflow)
    main.py               (FastAPI Server)
    functions.py          (UDFs & Tool Definitions)
    config.py             (Model Configuration)
    routers/
  frontend/
  docker-compose.yml      (One-Command Deploy)
  Dockerfile              (Multi-stage build)
  AGENTS.md               (AI coding guide)
import pixeltable as pxt
from pixeltable.functions import openai, image as pxt_image
from pixeltable.functions.anthropic import messages, invoke_tools
from pixeltable.functions.document import document_splitter
from pixeltable.functions.huggingface import sentence_transformer, clip
from pixeltable.functions.video import extract_audio, frame_iterator
import config, functions

pxt.create_dir("app", if_exists="ignore")
sentence_embed = sentence_transformer.using(model_id=config.EMBEDDING_MODEL_ID)

# ── 1. Document Pipeline ─────────────────────────────────────
documents = pxt.create_table("app.documents", {
    "document": pxt.Document, "timestamp": pxt.Timestamp,
})
chunks = pxt.create_view("app.chunks", documents,
    iterator=document_splitter(document=documents.document,
        separators="page, sentence", metadata="title, heading, page"),
)
chunks.add_embedding_index("text", string_embed=sentence_embed)

@pxt.query
def search_documents(query_text: str):
    sim = chunks.text.similarity(query_text)
    return chunks.where(sim > 0.5).order_by(sim, asc=False).limit(20)

# ── 2. Image Pipeline ────────────────────────────────────────
images = pxt.create_table("app.images", {
    "image": pxt.Image, "timestamp": pxt.Timestamp,
})
images.add_computed_column(
    thumbnail=pxt_image.b64_encode(pxt_image.thumbnail(images.image, size=(320, 320)))
)
images.add_embedding_index("image",
    embedding=clip.using(model_id=config.CLIP_MODEL_ID))

# ── 3. Video Pipeline ────────────────────────────────────────
videos = pxt.create_table("app.videos", {
    "video": pxt.Video, "timestamp": pxt.Timestamp,
})
video_frames = pxt.create_view("app.video_frames", videos,
    iterator=frame_iterator(video=videos.video, keyframes_only=True))
video_frames.add_embedding_index("frame",
    embedding=clip.using(model_id=config.CLIP_MODEL_ID))
videos.add_computed_column(audio=extract_audio(videos.video, format="mp3"))
# audio → Whisper transcription → sentence splitting → embedding (chained views)

# ── 4. Chat History ──────────────────────────────────────────
chat_history = pxt.create_table("app.chat_history", {
    "role": pxt.String, "content": pxt.String,
    "conversation_id": pxt.String, "timestamp": pxt.Timestamp,
})
chat_history.add_embedding_index("content", string_embed=sentence_embed)

# ── 5. Agent Pipeline (8-step tool-calling workflow) ─────────
# (search_video_transcripts, search_images, search_video_frames, and
# search_chat_history are @pxt.query functions defined like
# search_documents above; elided here for brevity)
tools = pxt.tools(functions.web_search, search_video_transcripts)

agent = pxt.create_table("app.agent", {
    "prompt": pxt.String, "timestamp": pxt.Timestamp,
    "initial_system_prompt": pxt.String,
    "final_system_prompt": pxt.String,
    "max_tokens": pxt.Int, "temperature": pxt.Float,
})

# Step 1: Initial LLM call with tool selection
agent.add_computed_column(
    initial_response=messages(
        model=config.CLAUDE_MODEL_ID,
        messages=[{"role": "user", "content": agent.prompt}],
        tools=tools, tool_choice=tools.choice(required=True),
    )
)
# Step 2: Execute selected tools
agent.add_computed_column(tool_output=invoke_tools(tools, agent.initial_response))
# Step 3: Parallel RAG context retrieval
agent.add_computed_column(doc_context=search_documents(agent.prompt))
agent.add_computed_column(image_context=search_images(agent.prompt))
agent.add_computed_column(video_frame_context=search_video_frames(agent.prompt))
agent.add_computed_column(chat_memory_context=search_chat_history(agent.prompt))
# Steps 4-6: Assemble multimodal context + final messages
agent.add_computed_column(
    multimodal_context=functions.assemble_context(
        agent.prompt, agent.tool_output, agent.doc_context, agent.chat_memory_context,
    )
)
agent.add_computed_column(
    final_messages=functions.assemble_final_messages(
        agent.history_context, agent.multimodal_context,
        image_context=agent.image_context, video_frame_context=agent.video_frame_context,
    )
)
# Step 7: Final LLM reasoning
agent.add_computed_column(
    final_response=messages(model=config.CLAUDE_MODEL_ID, messages=agent.final_messages)
)
# Step 8: Extract answer
agent.add_computed_column(answer=agent.final_response.content[0].text)

5 data pipelines + 8-step agent workflow — all declarative. One file replaces hundreds of lines of glue code.
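Why does declaring columns replace an orchestrator? Because computed columns form a dependency graph: each column names the columns it reads, so a single insert can drive every step in order. A toy sketch of that execution model in plain Python (the step functions are invented stand-ins, not the real pipeline):

```python
# Toy sketch of a declarative step chain: each computed column may reference
# earlier ones, so one insert drives the whole workflow in dependency order.
steps = {}  # column name -> (dependency names, function)

def add_computed_column(name, deps, fn):
    steps[name] = (deps, fn)

def insert(row):
    # dicts preserve declaration order, which here matches dependency order
    for name, (deps, fn) in steps.items():
        row[name] = fn(*(row[d] for d in deps))
    return row

add_computed_column('tool_output', ['prompt'], lambda p: f'searched: {p}')
add_computed_column('context', ['prompt', 'tool_output'], lambda p, t: f'{p} | {t}')
add_computed_column('answer', ['context'], lambda c: c.upper())

row = insert({'prompt': 'hi'})
print(row['answer'])  # → 'HI | SEARCHED: HI'
```

In the agent table above, the same principle means inserting one prompt row runs all eight steps with no DAG definition.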

For developers

Developing with AI Tools

Pixeltable's declarative API means AI coding assistants get it right on the first try. Ten lines of code give you a persistent, versioned, incrementally optimized pipeline.

Building with LLMs · Why vibe-coded apps break

  • llms.txt: Concise documentation for LLMs
  • llms-full.txt: Complete API reference for LLMs
  • MCP Server: Interactive Pixeltable exploration (tables, queries, Python REPL)
  • Claude Code Skill: Deep Pixeltable expertise for Claude
  • AGENTS.md: Architecture guide for AI agents working with your codebase

Your Backend for Multimodal AI

pip install pixeltable: your entire AI data stack
Instead of ... → Pixeltable gives you ...

  • PostgreSQL / MySQL → pxt.create_table() — schema is Python, versioned automatically
  • Pinecone / Weaviate / Qdrant → add_embedding_index() — one line, stays in sync
  • S3 / boto3 / blob storage → pxt.Image / Video / Audio / Document — native types with caching
  • Airflow / Prefect / Celery → computed columns — trigger on insert, no orchestrator needed
  • LangChain / LlamaIndex (RAG) → @pxt.query + .similarity() — computed column chaining
  • pandas / polars (multimodal) → .sample(), add_computed_column() — prototype to production
  • DVC / MLflow / W&B → history(), revert(), time travel — built-in snapshots
  • Custom retry / rate-limit / caching → built into every AI integration — results cached, only new rows recomputed
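The add_embedding_index() row compresses a lot: a vector store's core job is just "keep an embedding next to each row and rank by similarity." A toy sketch with bag-of-words counts standing in for real embeddings (embed, cosine, and search are invented for illustration; they are not Pixeltable APIs):

```python
# Toy sketch of an embedding index: vectors stored next to rows,
# queried by cosine similarity. Real systems use learned embeddings;
# word counts are a stand-in that keeps this self-contained.
import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)   # Counter returns 0 for missing keys
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

docs = ['a dog in the park', 'quarterly revenue report', 'dogs playing fetch']
index = [(d, embed(d)) for d in docs]   # kept in sync as rows are inserted

def search(query, k=1):
    q = embed(query)
    return sorted(index, key=lambda p: -cosine(q, p[1]))[:k]

print(search('dog park')[0][0])  # → 'a dog in the park'
```

The "stays in sync" part of the claim is the interesting half: because the index is declared on a column, inserts update it automatically instead of requiring a separate sync script.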
Use Cases

What Can You Build?

Pixeltable's primitives compose into any multimodal AI workflow

Data Wrangling for ML

Curate, augment, and export training datasets. Pre-annotate with models, integrate Label Studio, export to PyTorch.

example.py
# Extract frames, run detection, export
frames.add_computed_column(
  objects=yolox(frames.frame)
)
pxt.io.export_parquet(frames)
Curate & Augment · Pre-annotate · Export PyTorch · Version Control
Learn more

Backend for AI Apps

Build RAG systems, semantic search, and multimodal APIs. Pixeltable handles storage, retrieval, and orchestration.

example.py
# Add embedding index for search
docs.add_embedding_index(
  'content', 
  embedding=openai.embeddings()
)
RAG Pipelines · Vector Search · Multimodal APIs · Auto-sync
Learn more

Agents & MCP

The ultimate Agent Harness. Tool-calling agents with persistent memory, MCP server integration, and automatic conversation history.

example.py
# LLM with tools, Pixeltable executes
t.add_computed_column(
  response=openai.chat_completions(
    messages=t.msgs, tools=tools
))
Agent Harness · Persistent Memory · MCP Integration · State Management
Learn more
Get Started in 5 Minutes · Explore Public Datasets


Every Era of Data Gets an Owner

Oracle for relational. Snowflake for analytics. Databricks for batch.
The multimodal data plane is next.

Start Building with Pixeltable · 10-Min Quickstart

The declarative data infrastructure for multimodal AI. Build production-ready video, image, and document workflows in minutes, not months.

GitHub · X · Discord · YouTube · LinkedIn

Product

  • Blog
  • Pricing
  • Changelog
  • GitHub (Open Source)

Resources

  • Examples
  • Tutorials
  • API Reference
  • Pixelagent

Company

  • About
  • Careers (Hiring)
  • Contact
  • Privacy

Get Started

  • Pixelbot
  • Starter Kit
  • Deployment Guide

© 2026 Pixeltable, Inc. All rights reserved.

Terms of ServicePrivacy PolicySecurity