Building the video backend for agents.

Field notes, build stories, research, and working projects on video infrastructure, retrieval, streaming, and agent tooling.

01 /
Field Notes

Recent production lessons
and technical notes.

Browse short notes on infrastructure, reliability, latency, retrieval, and the small fixes that matter in production.

02 /
Build Notes

Build notes
from working VideoDB projects.

Follow the architecture choices, API decisions, and tradeoffs behind apps, demos, and agent tools built with VideoDB.

03 /
Newsletter

Newsletters
from the VideoDB engineering team.

Read concise updates, technical notes, and build context from the team working on video infrastructure and agents.

04 /
Research

Research at the edge
of video and agents.

Papers, evaluations, talks, and notes on video understanding, retrieval, multimodal models, and agent systems.

arXiv preprint arXiv:2604.11177 · 2026/4/13

Do Thought Streams Matter? Evaluating Reasoning in Gemini Vision-Language Models for Video Scene Understanding

Shivam Sharma, Sankalp Nagaonkar, Ashish Choithani, Ashutosh Trivedi

Benchmarks how internal reasoning traces affect video scene understanding in Gemini models, including where quality gains plateau and how tight budgets increase compression-step hallucination.

Read on arXiv