Inspiration

While learning about Qloo, I came across the documentation for their platform, 'Insights by Qloo', where one can generate cultural insights from a textual prompt. I felt it should also support multimodal input as a prompt. With this thought, Qlootient became a first step towards a complete multimodal cultural intelligence suite.

What it does

We are starting this product by exploring what cultural context can be derived from images alone. Different images of the same thing, person, or place can carry different cultural signatures. With Qlootient, we are trying to reveal them quantitatively using the Qloo Taste AI API.

How we built it

It was built in a few simple steps:

  1. Extracting starting cultural hints from an image: done using the Gemini API (LLM).
  2. Turning the extracted hints into quantifiable cultural information: done using the Qloo Taste AI API.
  3. Presenting it in a lightweight app: thanks to Gradio!
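The steps above can be sketched in a few lines. This is a minimal illustration only: the prompt text, the Qloo endpoint path, and the parameter names are assumptions for the sketch, not Qlootient's exact implementation.

```python
"""Sketch of the Qlootient pipeline: image -> LLM hints -> Qloo query.

Hypothetical: the prompt wording, endpoint URL, and parameter names
below are illustrative assumptions, not the app's actual code.
"""
from urllib.parse import urlencode

# Assumed Qloo Insights endpoint for illustration.
QLOO_BASE = "https://hackathon.api.qloo.com/v2/insights"

# Step 1: the kind of prompt one might send to Gemini with the image.
HINT_PROMPT = (
    "List the cultural signals visible in this image "
    "(cuisines, music genres, places, aesthetics) as comma-separated tags."
)


def parse_hints(llm_reply: str) -> list[str]:
    """Normalize the LLM's comma-separated reply into clean tags."""
    return [t.strip().lower() for t in llm_reply.split(",") if t.strip()]


def build_insights_url(hints: list[str],
                       entity_type: str = "urn:entity:place",
                       take: int = 5) -> str:
    """Step 2: assemble a Qloo-style Insights request URL from the hints."""
    params = {
        "filter.type": entity_type,           # assumed entity-type filter
        "signal.interests.tags": ",".join(hints),
        "take": take,                          # assumed result-count parameter
    }
    return f"{QLOO_BASE}?{urlencode(params)}"
```

In the real app, the image and `HINT_PROMPT` would go to Gemini, the reply through `parse_hints`, the resulting URL to Qloo with an API key header, and the response into a Gradio interface (step 3).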

Challenges we ran into

While it was amazing to see the rich collection of parameters and output fields available in the Qloo Taste AI API, it was a little challenging to figure out and pick precisely the ones needed for a concise and coherent application.

Accomplishments that we're proud of

We completed the project on time.

What we learned

A great deal about Qloo, and an altogether new landscape: cultural intelligence about nearly everything.

What's next for Qlootient

  1. Since the goal is multimodality, adding video and audio input is worth trying as well.
  2. Enriching the LLM variety by adding other models (e.g. GPT-4o, Claude, DeepSeek).
  3. Adding more features to play with images (e.g. segmentation).
  4. And many more general features to turn it into a complete cultural intelligence suite with multimodal capabilities.
