Skinalor: Offline Dermatology Assistant (VLM + RAG, Android Edge)
Research Aim: To develop a multimodal dermatology assistant that integrates Retrieval-Augmented Generation (RAG) and parameter-efficient fine-tuning to deliver reliable, domain-specific question answering with offline, on-device deployment.

Skinalor accepts image and text inputs and uses a multimodal model for preliminary screening. A RAG mechanism grounds the model's answers in retrieved evidence to improve accuracy and explainability. The key innovations are: RAG-enhanced medical reasoning; optimized semantic retrieval performance; edge deployment through model compression; and multi-turn dialogue that simulates real-world medical consultations. The result is privacy-preserving diagnosis support on Android, combining a Vision-Language Model (VLM) with RAG for fully offline inference.

Models & Fine-tuning: benchmarked MedGemma-4B-IT / Lingshu-7B / MedVLM-R1; parameter-efficient fine-tuning with LoRA (r=8, α=16, dropout=0).

RAG: CLIP-based image/text embeddings stored in ChromaDB, CLIP-IQA for image-quality filtering, and Text-Prompt Fusion for evidence-grounded answers.

Compression & Export: 4-bit quantization (AWQ/WOQ, block=64); exports via MNN / ONNX / TFLite / AI-Edge-Torch for edge inference.

Prototype: Android pipeline (capture → retrieval → VLM reasoning → explanation) with zero network dependency, so user data stays on the device.

Stack: Python, PyTorch, scikit-learn, CLIP, ChromaDB, PyIQA, MNN, ONNX/TFLite, Android

Project Overview: https://www.linkedin.com/feed/update/urn:li:activity:7360960454346649600/
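
To give a feel for what 4-bit weight-only quantization with a block size of 64 involves, here is a minimal pure-Python sketch. It is not the project's export code and it is not AWQ itself: AWQ additionally rescales salient weight channels using activation statistics, which this example omits. The function names (`quantize_block`, `quantize`, `dequantize`) and the asymmetric round-to-nearest scheme are assumptions chosen for illustration only.

```python
BLOCK_SIZE = 64  # group size quoted in the write-up ("block=64")
LEVELS = 15      # 4 bits give integer codes 0..15


def quantize_block(block):
    """Asymmetric round-to-nearest quantization of one block (illustrative)."""
    lo, hi = min(block), max(block)
    # Fall back to a scale of 1.0 when every weight in the block is identical.
    scale = (hi - lo) / LEVELS or 1.0
    codes = [round((w - lo) / scale) for w in block]
    return codes, scale, lo


def quantize(weights, block_size=BLOCK_SIZE):
    """Split a flat weight list into blocks and quantize each independently."""
    return [
        quantize_block(weights[start:start + block_size])
        for start in range(0, len(weights), block_size)
    ]


def dequantize(blocks):
    """Reconstruct approximate weights from (codes, scale, offset) triples."""
    restored = []
    for codes, scale, lo in blocks:
        restored.extend(lo + c * scale for c in codes)
    return restored
```

Each block stores its own scale and offset, so an outlier weight only degrades the precision of the 64 values that share its block. With round-to-nearest, the reconstruction error of any single weight is bounded by half of that block's scale.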