Who is that cat

Multimodal track

Inspiration

Tracking stray cats is a challenge in modern cities. Our goal was to make this task easier by taking advantage of people's natural tendency to take pictures of cats.

What it does

Given a submitted photo, it identifies which stray cat in the database the photo shows.

How we built it

We used Meta's Llama via Together.ai (meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8) to determine whether the photo was relevant, that is, whether it showed a cat. If the model answered yes, we proceeded to image retrieval: we used the VGG-19 model to determine which cat in the database the photo most closely matches. We then asked the language model to formulate the answer, adding hypothetical information about the cat.
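The flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: it assumes each photo has already been turned into a feature vector (for example by VGG-19) and that the closest cat is chosen by cosine similarity, which is an assumption, since the similarity measure is not specified here. The names `cosine_similarity`, `best_match`, and `identify_cat`, the callables `is_cat`, `embed`, and `describe`, and the cat names in the toy database are all hypothetical stand-ins for the Llama relevance check, the VGG-19 feature extractor, and the Llama answer step.

```python
import math
from typing import Callable, Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float], database: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the (name, score) of the database entry closest to the query."""
    if not database:
        raise ValueError("the cat database is empty")
    scored = ((name, cosine_similarity(query, vec)) for name, vec in database.items())
    return max(scored, key=lambda pair: pair[1])


def identify_cat(
    photo: object,
    is_cat: Callable[[object], bool],               # hypothetical: LLM relevance check
    embed: Callable[[object], Sequence[float]],     # hypothetical: VGG-19 features
    database: Dict[str, Sequence[float]],
    describe: Callable[[str, float], str],          # hypothetical: LLM answer step
) -> Optional[str]:
    """Gate on relevance, retrieve the closest cat, then phrase the answer."""
    if not is_cat(photo):
        return None  # irrelevant photo: skip retrieval entirely
    name, score = best_match(embed(photo), database)
    return describe(name, score)


# Toy data standing in for real embeddings (assumed, for illustration only).
toy_database = {"Misty": [1.0, 0.0, 0.0], "Tiger": [0.0, 1.0, 0.0]}
answer = identify_cat(
    photo=[0.9, 0.1, 0.0],
    is_cat=lambda p: True,
    embed=lambda p: p,
    database=toy_database,
    describe=lambda name, score: f"This looks like {name}.",
)
print(answer)
```

Passing the three model steps in as callables keeps the retrieval logic independent of any particular API client, so the relevance check and answer generation can be swapped out or stubbed for testing.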