Inspiration

The inspiration for MediLocal came from a simple question: how can we make helpful AI accessible to everyone, not just those with stable internet? Living in an area where connectivity can be inconsistent, I wanted to build a tool that empowers users without compromising their privacy. The release of powerful local models like gpt-oss and the hackathon's "For Humanity" theme motivated me to create a practical, privacy-first health assistant that works entirely offline, ensuring a user's sensitive health data never leaves their computer.


How We Built It

The project followed a full-stack development cycle, from data to deployment:

  1. Data Processing: We started with a large medical dataset from Kaggle and wrote a Python script to transform the structured CSV data into a conversational JSONL format suitable for fine-tuning.
  2. AI Fine-Tuning: Using Unsloth for its speed and memory efficiency, we fine-tuned OpenAI's gpt-oss-20b model on our prepared dataset. This specialized the general model for our specific medical diagnosis task.
  3. Local Serving: The fine-tuned model was quantized to GGUF and served locally using Ollama. This created a high-performance, offline API endpoint right on the user's machine.
  4. Application & Logic: A Next.js and TypeScript application provides the user interface. An API route integrates LangChain to structure prompts and ensure safe, consistent responses from the model.
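The fine-tuning step (step 2) boils down to a short configuration. This is a sketch only, not runnable as-is (a 20B model needs a large GPU): it assumes Unsloth's `FastLanguageModel` API, the model id and hyperparameters are illustrative, and the `target_modules` list shown here is a common default rather than the verified set for the gpt-oss architecture:

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit to make fine-tuning fit in GPU memory.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",  # assumed model id; check the hub
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters — parameter-efficient fine-tuning (PEFT) trains
# only these small adapter matrices, not the full 20B parameters.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # illustrative
)
```

Training then proceeds with a standard supervised fine-tuning loop over the JSONL dataset.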
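The data-processing step (step 1) can be sketched as a small, dependency-free script. The column names `symptoms` and `diagnosis` are placeholders, not the actual Kaggle schema, and would need to match the real dataset:

```python
import csv
import json

def csv_to_jsonl(csv_path: str, jsonl_path: str) -> int:
    """Convert rows of a medical CSV into chat-style JSONL records.

    Uses hypothetical 'symptoms' and 'diagnosis' columns; adapt the
    field names to the real dataset. Returns the number of records written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            record = {
                "messages": [
                    {"role": "user", "content": f"Symptoms: {row['symptoms']}"},
                    {"role": "assistant", "content": row["diagnosis"]},
                ]
            }
            # One JSON object per line — the JSONL shape most
            # fine-tuning tooling expects.
            dst.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Each output line is a self-contained conversation, which keeps the fine-tuning data loader simple.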

Challenges We Faced

  • Hardware Limitations: Fine-tuning a 20B model is incredibly resource-intensive. We had to carefully manage our setup and acknowledge that this proof-of-concept requires a powerful GPU.
  • Model Configuration: Identifying the correct LoRA target_modules for the gpt-oss architecture was a critical and challenging step that required deep-diving into the model's structure.
  • Deployment Bugs: We encountered and solved practical issues like needing Git LFS to version large dataset files and Ollama's requirement for absolute file paths, which are common hurdles in real-world MLOps.
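The absolute-path gotcha shows up in the Ollama Modelfile used to serve the quantized model (step 3). This is a minimal sketch with a hypothetical file path and system prompt; `FROM`, `PARAMETER`, and `SYSTEM` are standard Modelfile directives:

```
# Modelfile — the GGUF path must be absolute; a relative path here
# fails depending on where `ollama create` is run from.
FROM /home/user/models/medilocal-gpt-oss-20b-q4.gguf
PARAMETER temperature 0.2
SYSTEM "You are MediLocal, an offline health assistant. You are not a doctor; always recommend consulting a professional."
```

The model is then registered and served locally with `ollama create medilocal -f Modelfile` followed by `ollama run medilocal`, exposing an offline API endpoint on the user's machine.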

What We Learned

This project was a deep dive into the end-to-end lifecycle of a local AI application. We gained hands-on experience in parameter-efficient fine-tuning (PEFT), model quantization (GGUF), and the importance of a robust backend layer with tools like LangChain to safely guide a powerful LLM. Most importantly, we learned how to build a complete, functional product that addresses a real-world need.
