A smart query interface for your videos. The idea is to use an LLM to decompose a natural-language query into a sequence of steps, then execute those steps with existing foundation models, image processing, or arithmetic and logical operations. Example queries: "Show me scenes of person A and B talking about X" or "Clips of player X doing Y" (for sports and gaming use cases).
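As a rough illustration of the decompose-then-execute idea, here is a minimal sketch for the query "Show me scenes of person A and B talking about X". Everything in it is an assumption made for illustration, not the project's actual code: the `Segment` type, the plan format, the `detect` stub, and the hard-coded `DETECTIONS` table (which stands in for the output of real face-recognition and speech-topic models). The plan is written by hand here; in the real system it would be the steps produced by the LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A time range in the video, in seconds."""
    start: float
    end: float


def intersect(a, b):
    """Return the time ranges where a segment from `a` overlaps one from `b`."""
    out = []
    for x in a:
        for y in b:
            start, end = max(x.start, y.start), min(x.end, y.end)
            if start < end:
                out.append(Segment(start, end))
    return sorted(out, key=lambda s: s.start)


# Hypothetical model outputs: stand-ins for real detectors.
DETECTIONS = {
    ("face", "A"): [Segment(0, 30), Segment(50, 80)],
    ("face", "B"): [Segment(20, 60)],
    ("topic", "X"): [Segment(25, 55)],
}


def detect(kind, label):
    """Stub for a model call; returns the segments where `label` was found."""
    return DETECTIONS.get((kind, label), [])


# Hand-written plan, assumed to be the kind of step list an LLM would emit.
PLAN = [
    {"op": "detect", "kind": "face", "label": "A", "out": "a"},
    {"op": "detect", "kind": "face", "label": "B", "out": "b"},
    {"op": "detect", "kind": "topic", "label": "X", "out": "x"},
    {"op": "intersect", "inputs": ["a", "b"], "out": "ab"},
    {"op": "intersect", "inputs": ["ab", "x"], "out": "result"},
]


def run_plan(plan):
    """Execute each step in order, storing intermediate results by name."""
    env = {}
    for step in plan:
        if step["op"] == "detect":
            env[step["out"]] = detect(step["kind"], step["label"])
        elif step["op"] == "intersect":
            left, right = (env[name] for name in step["inputs"])
            env[step["out"]] = intersect(left, right)
        else:
            raise ValueError(f"unknown op: {step['op']}")
    return env["result"]


if __name__ == "__main__":
    for seg in run_plan(PLAN):
        print(f"{seg.start}s to {seg.end}s")
```

The logical step here is a plain interval intersection: a scene matches only where both people are on screen and the topic is being discussed. Other queries would need other operations (union, counting, thresholds), which this sketch does not cover.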
