Inspiration
Team Rudra was inspired by a simple but persistent problem we all faced while using complex software: learning slows us down more than the task itself. Whether it was design tools, development environments, or enterprise platforms, we found ourselves repeatedly switching between the application and external tutorials, videos, or documentation. Despite the rise of AI assistants, most solutions remained confined to chat windows and could not help users perform actual actions inside the software. This gap between knowing and doing motivated us to rethink how humans learn software in real time.
What it does
Our project, Instructly, is a GenAI-powered real-time guidance system that helps users perform tasks directly inside software applications. Instead of reading instructions or watching videos, users receive step-by-step, visual, on-screen guidance while working. The system understands user intent in natural language and translates it into actionable guidance, visually highlighting buttons, menus, and UI elements. This eliminates context switching and allows users to learn by doing.
How we built it
We built Instructly using a combination of:
- Large Language Models (LLMs) for understanding user intent and generating contextual instructions
- Prompt engineering and intent classification to convert natural language queries into structured actions
- System-level UI mapping to detect active application elements
- Visual pointer overlays to guide users through real-time interactions
The architecture follows a modular approach, enabling integration through an SDK so that the system can be embedded directly into applications. The overall goal was to move from a traditional chatbot model to an action-driven AI system that interacts with software interfaces.
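The intent-to-guidance pipeline above can be sketched roughly as follows. This is a minimal illustration rather than Instructly's actual code: the `classify_intent` function and the `GuidanceStep` schema are assumptions standing in for the real LLM call and structured-output prompt.

```python
from dataclasses import dataclass

@dataclass
class GuidanceStep:
    """One on-screen guidance step: which element to highlight and why."""
    element_name: str   # label of the UI control to point at
    control_type: str   # e.g. "Button", "MenuItem"
    instruction: str    # short text shown next to the visual pointer

def classify_intent(query: str) -> list:
    """Stand-in for the LLM + prompt-engineering layer.

    A real system would send `query` to the model with a prompt that
    forces structured output; here one example intent is hard-coded.
    """
    if "export" in query.lower():
        return [
            GuidanceStep("File", "MenuItem", "Open the File menu"),
            GuidanceStep("Export As...", "MenuItem", "Choose Export As..."),
            GuidanceStep("Save", "Button", "Confirm the export"),
        ]
    return []  # unrecognized intent: fall back to asking the user

steps = classify_intent("How do I export my project?")
for i, step in enumerate(steps, 1):
    print(f"{i}. Highlight {step.control_type} '{step.element_name}': {step.instruction}")
```

On Windows, each resolved step would then be matched to a live control (for example via pywinauto's UI Automation backend) and a pointer overlay drawn at its bounding rectangle.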
Challenges we ran into
One of the major challenges was bridging the gap between AI-generated instructions and real UI interactions. Mapping abstract user intent to precise interface elements required careful handling of UI context and edge cases. Keeping guidance accurate across different workflows proved equally demanding, as did designing overlays that stay intuitive without overwhelming the user.
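One way to handle that mapping gap can be sketched with Python's standard library; the element names and similarity cutoff here are illustrative assumptions, not Instructly's actual values. The idea is that the label the model emits rarely matches the application's accessibility tree exactly, so after normalization a fuzzy match picks the closest real control.

```python
from difflib import get_close_matches
from typing import Optional

def resolve_element(llm_label: str, ui_elements: list, cutoff: float = 0.6) -> Optional[str]:
    """Map an AI-generated element name to the closest real UI control.

    Normalizes case and trailing dots/ellipses, then falls back to
    fuzzy string matching so 'Export as' still finds 'Export As...'.
    Returns None when nothing is similar enough (an edge case the
    caller must handle, e.g. by re-prompting the model).
    """
    normalized = {e.lower().rstrip(".\u2026"): e for e in ui_elements}
    key = llm_label.lower().rstrip(".\u2026")
    if key in normalized:                       # exact match after normalization
        return normalized[key]
    matches = get_close_matches(key, list(normalized), n=1, cutoff=cutoff)
    return normalized[matches[0]] if matches else None

controls = ["File", "Edit", "Export As...", "Preferences"]
print(resolve_element("Export as", controls))   # → 'Export As...'
print(resolve_element("Fil", controls))         # → 'File' (fuzzy match)
```

A production mapper would also use control type and position, since several controls can share a label.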
Accomplishments that we're proud of
- Successfully designed a real-time, visual guidance system instead of a text-only assistant
- Reduced dependency on external tutorials by enabling in-app learning
- Built a scalable and modular architecture suitable for multiple software platforms
- Demonstrated how GenAI can move beyond chat and into direct action assistance
What we learned
Through this project, we learned that effective AI systems are not just about generating correct answers, but about integrating intelligence into user workflows. We gained hands-on experience in system design, GenAI integration, and user-centric problem solving. Most importantly, we learned how powerful learning becomes when users are guided through actions rather than explanations.
What's next for Team Rudra
Moving forward, Team Rudra plans to expand Instructly into a full-fledged SDK that can support multiple platforms and domains. Future work includes improving UI detection accuracy, adding adaptive personalization, and conducting larger user studies to measure impact. Our long-term vision is to make all complex software self-explanatory, enabling users to achieve expert-level productivity from day one.
Built With
- client
- cloud
- customtkinter
- firebase-hosting
- firestore
- google-cloud
- google-cloud-functions
- google-cloud-text-to-speech
- google-cloud-vision-api-(document-ai)
- python
- llama-3-(quantized)
- windows-ui-automation-(pywinauto)