Research at the edge of video, vision-language models, and agents.

Research projects, papers, preprints, talks, and references from the VideoDB research archive.

#002 · arXiv preprint arXiv:2604.11177 · 2026/4/13

Do Thought Streams Matter? Evaluating Reasoning in Gemini Vision-Language Models for Video Scene Understanding

Shivam Sharma, Sankalp Nagaonkar, Ashish Choithani, Ashutosh Trivedi

Benchmarks how internal reasoning traces affect video scene understanding in Gemini models, including where quality gains plateau and how tight budgets increase compression-step hallucination.

#007 · Research project · Ongoing

Chitrakāvya

Ashutosh Trivedi

Sanskrit picture-poems rebuilt as computational artifacts, connecting sloka, geometry, knight's tours, palindromes, and early algorithmic thinking.

#008 · Research project · Ongoing

Emergence

Ashutosh Trivedi

Notes on multi-agent systems, collective intelligence, alignment, swarm behavior, and computational models of consciousness.