
Architecture

Iris is organized as a multi-surface application with a frontend editor, a backend API, AI adapters, and an optional GPU worker.

Repository map

  • frontend/: React, Vite, route surfaces, editor UI, project browsing, continuity UI
  • backend/: FastAPI routes, database models, workers, media-processing pipeline
  • ai/: provider adapters, prompts, AI-side configuration and tests
  • cli/: command-line interface wrapping the backend API, includes multi-agent video analysis
  • gpu-worker/: optional segmentation and vision-heavy worker
  • infra/: local and deployment infrastructure helpers

Frontend responsibilities

The frontend owns:
  • navigation between landing, projects, and editor
  • upload initiation
  • project browsing
  • selection interactions
  • generation controls
  • continuity workflow surfaces

Backend responsibilities

The backend owns:
  • upload ingestion
  • project persistence
  • generation jobs
  • identification and mask endpoints
  • propagation and timeline updates
  • narration and export flows

Media-processing layer

The FFmpeg service bridges uploaded media and the editorial operations built on top of it: probing, clipping, stitching, and keyframe extraction.
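
A minimal sketch of the probe step, shelling out to ffprobe (the real service wraps more operations, and these function names are assumptions):

```python
import json
import subprocess

def probe_cmd(path: str) -> list[str]:
    # Command assembly kept separate from execution so it can be inspected.
    return ["ffprobe", "-v", "quiet", "-print_format", "json",
            "-show_format", "-show_streams", path]

def probe(path: str) -> dict:
    """Return container and stream metadata for a media file via ffprobe."""
    result = subprocess.run(probe_cmd(path), capture_output=True,
                            text=True, check=True)
    return json.loads(result.stdout)
```

Clipping, stitching, and keyframe extraction follow the same pattern: build an ffmpeg argument list, run it, and surface the result to the rest of the backend.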

AI layer

The AI layer abstracts provider-specific logic behind product-level operations. The rest of the system can then speak in terms of product actions such as identify, generate, narrate, and propagate instead of calling provider SDKs directly.

Agent layer

The agent layer powers in-app conversational editing through Gemini function calling. When a user sends a message via agent chat, the backend opens an SSE stream and lets Gemini decide which tools to invoke (identify, mask, generate, propagate) based on the user's natural-language request. Tool calls and their results are streamed back as structured events so the frontend can visualize each step (tool call start, tool call end, suggestion, variant ready). This keeps the user informed while the agent orchestrates multi-step editing workflows autonomously.
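
The streamed events can be sketched as simple typed payloads serialized in SSE wire format. The event names come from the flow above; the payload shape is an assumption:

```python
import json
from dataclasses import dataclass

@dataclass
class AgentEvent:
    # Event types mirror the steps described above:
    # "tool_call_start", "tool_call_end", "suggestion", "variant_ready"
    type: str
    data: dict

def sse_format(event: AgentEvent) -> str:
    """Serialize one agent event as a Server-Sent Events message."""
    return f"event: {event.type}\ndata: {json.dumps(event.data)}\n\n"
```

On the frontend, a stream listener can dispatch on the `event:` field to drive the per-step visualization.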

Conversations

Agent chat history is persisted in Postgres through two models:
  • Conversation — belongs to a project, has a title and timestamps
  • ChatMessage — belongs to a conversation, stores role (user/assistant/system) and content
This lets users resume previous agent sessions, review what the agent did, and maintain context across editing sessions within the same project.

CLI

The cli/ directory provides a command-line interface that wraps the backend API. It supports project listing, uploads, and triggering edits from the terminal. The CLI also includes a multi-agent video analysis mode where multiple AI agents can collaboratively analyze and plan edits for a video, useful for batch processing and automated workflows.

CI/CD

The project uses GitHub Actions for continuous integration and deployment:
  • Typecheck — runs TypeScript strict-mode checks on the frontend
  • Tests — runs backend and frontend test suites
  • Auto-deploy — pushes to Vultr on merge to main
  • Docs sync — automatically deploys documentation updates to Mintlify