Inspiration
New workflows can be streamlined with multimodal AI. For example, it can remove the burden of admin work from people responsible for labor-intensive processes such as manufacturing. I wanted to explore how a multimodal agent can take care of mundane tasks like data entry, data matching, and correction. The idea was to explore a vertical industry that lives in the world of atoms and may suffer more from this burden.
What it does
InstantResolution lets workers at plants that manage inventory report issues with procured components. The report can be multimodal (i.e., a voice call or an image upload). The multi-agent system running on the server compares the input against what exists in the database (i.e., component images, metadata, and contract data), finds the data about the component, and references the supplier contract PDF for further information. It then uses the contract to tell the worker who filed the report what the next steps are under the agreement with the supplier.
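The flow above can be sketched in a few lines of plain Python. This is only an illustration, not the project's code: the `COMPONENTS` and `CONTRACTS` dictionaries, the `Report` class, and every function name are hypothetical stand-ins, and simple substring matching takes the place of the real agents and database.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for the component database and contract store.
COMPONENTS = {
    "C-100": {"name": "hydraulic valve", "supplier": "acme"},
    "C-200": {"name": "bearing housing", "supplier": "globex"},
}
CONTRACTS = {
    "acme": "Defective parts may be returned within 30 days for replacement.",
    "globex": "Defects must be reported within 14 days to claim a refund.",
}


@dataclass
class Report:
    worker: str
    description: str  # e.g. a call transcript or a caption for an uploaded image


def match_component(report):
    """Step 1: find the component whose name appears in the report text."""
    text = report.description.lower()
    for component_id, record in COMPONENTS.items():
        if record["name"] in text:
            return component_id
    return None


def find_contract(component_id):
    """Step 2: look up the contract clause for the component's supplier."""
    supplier = COMPONENTS[component_id]["supplier"]
    return CONTRACTS[supplier]


def next_steps(report, component_id, clause):
    """Step 3: turn the contract clause into a message for the worker."""
    return f"{report.worker}: for {component_id}, the contract says: {clause}"


def resolve(report):
    """Run the three steps in order, stopping early if nothing matches."""
    component_id = match_component(report)
    if component_id is None:
        return f"{report.worker}: no matching component found; please add details."
    return next_steps(report, component_id, find_contract(component_id))


print(resolve(Report("Dana", "The hydraulic valve arrived cracked")))
```

Keeping each step as its own function mirrors the one-agent-per-task split, so any step could be swapped for a model call without touching the others.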
How we built it
Stytch + Next.js for the image upload frontend. HappyRobot for call intake and outbound calls. The call information is passed to the Flask backend, where agents running on Agno + Gemini analyze the data. Three agents complete the analysis: one matches the data against the component database, one finds the right supplier contract, and one generates an immediate action for the worker. Search is powered by text embeddings from together.ai and CLIP-based image embeddings, with storage in Activeloop. Supplier contracts were also stored in Activeloop using LlamaIndex + together.ai.
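To illustrate the embedding search step, here is a minimal sketch of nearest-neighbor lookup by cosine similarity. It assumes the embeddings have already been computed; the three-dimensional vectors, the `INDEX` dictionary, and the function names are made up for illustration, and the real project relies on together.ai, CLIP, and Activeloop rather than this hand-rolled search.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_match(query, index):
    """Return the (component_id, score) pair most similar to the query vector."""
    scored = ((component_id, cosine(query, vector)) for component_id, vector in index.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy embeddings; real text or image embeddings have hundreds of dimensions.
INDEX = {
    "C-100": [0.9, 0.1, 0.0],
    "C-200": [0.1, 0.8, 0.3],
}

best_id, best_score = top_match([1.0, 0.0, 0.0], INDEX)
print(best_id, round(best_score, 3))
```

The same function works for text and image embeddings alike, as long as the query and the stored vectors come from the same embedding model.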
Accomplishments that we're proud of
The voice input flow works end-to-end.
What we learned
Improving the accuracy of the model takes time and requires a lot of monitoring and tuning.
What's next for InstantResolution
- Improve accuracy of all models
- Integrate SOPs into the workflow
- Expand the dataset and test accuracy
- Improve the use of image + text similarity search
- Add more actions, e.g., calling procurement managers or suppliers and generating reports