Inspiration
We looked at Google and compared it to ChatGPT, and we wanted to combine the strengths of both: energy efficient and smart.
What it does
A chat interface that combines a large and a small LLM into one energy-efficient system. By default the small LLM generates the text. If a generated token exceeds a certain perplexity threshold, that token is deleted and replaced with a token generated by the large LLM, and then generation switches back to the small LLM.
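The switching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the two models are replaced by hypothetical scripted stand-ins, per-token perplexity is assumed to be 1 / p for a token generated with probability p, and the names `hybrid_generate`, `token_perplexity`, `make_scripted`, and the threshold value of 10.0 are made up for this example.

```python
import math
from typing import Callable, List, Tuple

# A step function takes the tokens so far and returns (next_token, probability).
# In a real system this would wrap a model call; here it is a hypothetical stand-in.
StepFn = Callable[[List[str]], Tuple[str, float]]


def token_perplexity(prob: float) -> float:
    """Assumed definition: exp of the negative log-probability, i.e. 1 / prob."""
    return math.exp(-math.log(prob))


def hybrid_generate(small_step: StepFn, large_step: StepFn, prompt: List[str],
                    max_tokens: int, threshold: float) -> List[Tuple[str, str]]:
    """Generate with the small model, falling back to the large one per token."""
    tokens = list(prompt)
    labeled = []  # (token, which model produced it)
    for _ in range(max_tokens):
        token, prob = small_step(tokens)
        source = "small"
        if token_perplexity(prob) > threshold:
            # Discard the uncertain token and ask the large model instead.
            token, _ = large_step(tokens)
            source = "large"
        tokens.append(token)
        labeled.append((token, source))
        # The next iteration starts with the small model again.
    return labeled


def make_scripted(script: List[Tuple[str, float]]) -> StepFn:
    """Toy model: returns a fixed (token, probability) for each position."""
    return lambda tokens: script[len(tokens)]


# Toy data, invented for the example: the small model is unsure about token 3.
small = make_scripted([("The", 0.9), ("cat", 0.8), ("flies", 0.05), ("away", 0.7)])
large = make_scripted([("The", 0.95), ("cat", 0.9), ("sat", 0.85), ("down", 0.8)])

result = hybrid_generate(small, large, prompt=[], max_tokens=4, threshold=10.0)
print(result)
```

Keeping the source label next to each token is one way to later show which parts of the reply came from which model.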
How we built it
We built it with Python, HTML, CSS, and JavaScript, and we worked with Qwen 3 4B 2507 and an Ollama model.
Challenges we ran into
Image support and online search
Accomplishments that we're proud of
We are happy that we were able to show which text was generated by the large LLM and which by the small LLM.
What we learned
We learned more about the tokenization process and how its
What's next for fih
Next, we will add more models so that there are more large and small LLMs to choose from.