Inspiration
We wanted to build an AI-driven system that leverages vision-language models (VLMs) to enhance Tesla’s capabilities, possibly in autonomous systems, customer support, or AI-powered insights.
What it does
The project integrates LangChain with OpenAI’s GPT models (and potentially Groq’s models) to handle conversational AI, maintain session memory, and process vision-related inputs.
How we built it
We used LangChain to manage LLM interactions, session-based history storage, and potential multimodal processing through Groq’s vision-language models.
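As a rough illustration of the session-based history idea, here is a minimal sketch in plain Python. It is not our actual LangChain code: `SessionMemory`, `chat`, and `echo_model` are hypothetical names invented for this example, and `echo_model` is a stand-in for a real LLM call. It only shows the pattern of looking up a separate message history per session ID before each model call.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


class SessionMemory:
    """Hypothetical in-memory store: one message list per session ID."""

    def __init__(self) -> None:
        self._histories: Dict[str, List[Message]] = defaultdict(list)

    def get_history(self, session_id: str) -> List[Message]:
        # Returns the live list for this session, creating it on first use.
        return self._histories[session_id]


def chat(
    memory: SessionMemory,
    session_id: str,
    user_input: str,
    model: Callable[[List[Message]], str],
) -> str:
    """Append the user turn, call the model with the full history, store the reply."""
    history = memory.get_history(session_id)
    history.append(("user", user_input))
    reply = model(list(history))  # pass a copy so the model cannot mutate the store
    history.append(("assistant", reply))
    return reply


def echo_model(messages: List[Message]) -> str:
    # Assumption: placeholder for a real LLM call; it just counts user turns.
    user_turns = sum(1 for role, _ in messages if role == "user")
    return f"reply #{user_turns}: {messages[-1][1]}"
```

Calling `chat(memory, "a", ...)` twice and `chat(memory, "b", ...)` once keeps the two conversations separate, which is the behavior we relied on for multi-session interactions.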
Challenges we ran into
- Optimizing memory storage for multi-session interactions
- API constraints like token limits and response handling
- Tuning the model for accuracy in vision-language tasks
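One common way to cope with token limits is to keep only the most recent turns that fit a budget. The sketch below assumes a crude whitespace word count as the token estimate; real tokenizers count differently, and `estimate_tokens` and `trim_history` are hypothetical helpers for illustration, not part of any library or of our project code.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def estimate_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens (at least 1 per message).
    return max(1, len(text.split()))


def trim_history(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep the newest messages whose combined estimated cost fits max_tokens."""
    kept: List[Message] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for role, content in reversed(messages):
        cost = estimate_tokens(content)
        if used + cost > max_tokens:
            break
        kept.append((role, content))
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first is the simplest policy; summarizing old turns instead would preserve more context at the cost of an extra model call.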
Accomplishments that we're proud of
- Successfully implemented session-based chat memory
- Integrated OpenAI’s GPT-4o-mini for fast, efficient responses
- Explored Groq’s vision-language models for potential enhancements
What we learned
- How to efficiently manage session-based memory in LangChain
- The strengths and limitations of different LLM providers
- Strategies for optimizing AI models in constrained environments
What's next for Tesla VLM Track
- Fine-tuning the model for specific Tesla-related applications
- Expanding the use of vision-language models for real-time insights
- Exploring hardware acceleration for on-device AI processing