Inspiration
Our inspiration came from the depth and complexity of Hindu scriptures and the desire to use AI as a tool for deeper engagement. We noticed generic language models often lack the specific stylistic nuances and conceptual understanding needed for these texts. We wanted to go beyond simple API calls and implement a solution ourselves – adapting a foundation model to become truly fluent in the language and style of the scriptures, potentially making these rich texts more accessible.
What it does
Our application loads 'hindu_scripture_pretrained', a language model we created by adapting Phi-2 to Hindu scriptures through continued pre-training on the Gautschi cluster. Given a prompt, such as the start of a verse or a topic, it generates a coherent continuation in the vocabulary and tone of the source texts. It serves as a demonstration of domain-specific adaptation for nuanced text generation.
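To make the prompt-to-continuation flow concrete, here is a minimal sketch of how a checkpoint like this could be loaded in 4-bit precision (the write-up mentions 4-bit quantization for the demo) and asked to continue a prompt. It is not our actual app.py: the local path in `MODEL_DIR`, the helper names, and the sampling values are assumptions for illustration. Running `load_model` needs a CUDA GPU with the `transformers`, `torch`, and `bitsandbytes` packages installed.

```python
from typing import Any, Dict

# Model name taken from the write-up; treating it as a local directory is an assumption.
MODEL_DIR = "hindu_scripture_pretrained"


def generation_kwargs(max_new_tokens: int = 120, temperature: float = 0.8) -> Dict[str, Any]:
    """Sampling settings for text continuation (illustrative values, not tuned)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": 0.95,
    }


def load_model(model_dir: str = MODEL_DIR):
    """Load tokenizer and model with 4-bit weights (requires a CUDA GPU)."""
    # Imported lazily so this file can be imported on machines without GPU libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir, quantization_config=quant, device_map="auto"
    )
    return tokenizer, model


def continue_text(prompt: str, tokenizer, model, **overrides: Any) -> str:
    """Generate a continuation of `prompt`; keyword overrides replace the defaults."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    settings = {**generation_kwargs(), **overrides}
    output_ids = model.generate(**inputs, **settings)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the model files in place, usage would look like `tokenizer, model = load_model()` followed by `continue_text("The opening words of a verse", tokenizer, model, max_new_tokens=60)`.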
How we built it
We implemented a full AI pipeline. First, we sourced and preprocessed diverse Hindu scripture texts. The core was developing and executing a custom continued pre-training script (pretrain.py) using Hugging Face Transformers and PyTorch. This demanding training ran via Slurm jobs on the Gautschi 'ai' partition GPUs, requiring careful environment configuration with specific Python, GCC, and CUDA modules, plus dependency management within a virtual environment. We debugged Slurm configurations for partition policies (like CPU/GPU ratios) and account flags. Finally, we built this demo (app.py) using 4-bit quantization for efficient loading, allowing interaction with our custom-trained model.
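One data-preparation step that continued pre-training scripts commonly include is packing the tokenized corpus into fixed-length blocks. The sketch below shows that idea in plain Python. It is an illustration under stated assumptions, not an excerpt from pretrain.py: `pack_into_blocks` and `make_toy_encoder` are hypothetical names, and the whitespace encoder stands in for the real Phi-2 tokenizer so the example runs without any model files.

```python
from typing import Callable, Dict, Iterable, List


def pack_into_blocks(
    texts: Iterable[str],
    encode: Callable[[str], List[int]],
    block_size: int,
    eos_id: int,
) -> List[Dict[str, List[int]]]:
    """Concatenate tokenized documents and cut the stream into equal-length blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    stream: List[int] = []
    for text in texts:
        stream.extend(encode(text))
        stream.append(eos_id)  # mark the boundary between documents
    usable = (len(stream) // block_size) * block_size  # drop the ragged tail
    blocks = []
    for start in range(0, usable, block_size):
        ids = stream[start:start + block_size]
        # For causal language modeling the labels are a copy of the inputs;
        # Hugging Face causal LM heads shift them by one position internally.
        blocks.append({"input_ids": ids, "labels": list(ids)})
    return blocks


def make_toy_encoder() -> Callable[[str], List[int]]:
    """Whitespace 'tokenizer' for demonstration only; ids start at 1, 0 is reserved for EOS."""
    vocab: Dict[str, int] = {}

    def encode(text: str) -> List[int]:
        return [vocab.setdefault(word, len(vocab) + 1) for word in text.split()]

    return encode


if __name__ == "__main__":
    sample = ["a b c", "d e"]  # placeholder strings, not scripture text
    for block in pack_into_blocks(sample, make_toy_encoder(), block_size=3, eos_id=0):
        print(block["input_ids"])
```

Packing keeps every training example the same length, so no compute is spent on padding, at the cost of discarding the few tokens that do not fill a final block.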
Challenges we ran into
Our main challenges revolved around the complexities of the HPC environment on Gautschi. We spent significant time debugging the Slurm job submission process – correctly specifying resource ratios for the 'ai' partition, resolving module dependencies (like needing GCC for specific Python versions), handling account flags, and ensuring the Python virtual environment activated correctly within the job. We also encountered and resolved filesystem write errors (Fsync failed, E212) related to storage limits or filesystem state, which required careful diagnosis. Overcoming these infrastructure hurdles was essential to successfully train our model.
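Much of that debugging came down to getting a handful of job-script lines consistent with each other. As a rough sketch, the helper below renders a Slurm batch script from explicit parameters so the CPU count always follows the GPU count. Everything specific here is an assumption: `render_sbatch` is a hypothetical helper, the account name, module names, venv path, and the CPUs-per-GPU value are placeholders, and the real values have to come from the cluster's user guide. The `#SBATCH` options themselves (`--account`, `--partition`, `--gpus-per-node`, `--cpus-per-task`, `--time`) are standard Slurm flags.

```python
from typing import Sequence


def render_sbatch(
    account: str,
    gpus: int,
    cpus_per_gpu: int,
    modules: Sequence[str],
    venv_dir: str,
    partition: str = "ai",
    time_limit: str = "04:00:00",
    command: str = "python pretrain.py",
) -> str:
    """Build the text of a single-node Slurm job script."""
    if gpus < 1 or cpus_per_gpu < 1:
        raise ValueError("gpus and cpus_per_gpu must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        f"#SBATCH --gpus-per-node={gpus}",
        # Keep the CPU request proportional to the GPU request.
        f"#SBATCH --cpus-per-task={gpus * cpus_per_gpu}",
        f"#SBATCH --time={time_limit}",
        "",
        "set -e  # stop at the first failing command",
        "module purge",
        *[f"module load {name}" for name in modules],
        f"source {venv_dir}/bin/activate",
        'echo "Using python: $(which python)"  # confirm the venv is active',
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    print(
        render_sbatch(
            account="myaccount",
            gpus=2,
            cpus_per_gpu=14,
            modules=["gcc", "python", "cuda"],
            venv_dir="$HOME/venvs/dharmai",
        )
    )
```

Printing the interpreter path inside the job is a cheap way to confirm in the job log that the virtual environment actually activated.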
Accomplishments that we're proud of
We're most proud of successfully implementing the entire domain adaptation pre-training pipeline and executing it on the Gautschi cluster within the hackathon timeframe. Despite significant environment and scheduling challenges, we created a unique language model specialized for Hindu scriptures (hindu_scripture_pretrained). This demonstrates a complete technical implementation that goes far beyond using pre-existing APIs or basic fine-tuning, directly fulfilling the goal of the 'Best AI-Driven Solution (No Wrapper Models)' category.
What we learned
This was a crash course in practical LLM training! We learned how continued pre-training of a foundation model differs from task-specific fine-tuning, how critical environment configuration (modules, venv) is on HPC systems, and how to debug Slurm jobs effectively. We gained hands-on experience with Transformers, PyTorch, datasets, quantization techniques, and the specific challenges of deploying AI workflows on powerful cluster resources like Gautschi's.
What's next for DharmAI
The immediate next step is creating a high-quality, task-specific dataset (e.g., for Q&A or explanations) to fine-tune our pre-trained model, enabling it to perform more directed tasks accurately. We also envision building a more user-friendly web interface for broader access. Longer-term, we want to expand the training data, rigorously evaluate the model's outputs for accuracy and bias, and explore its potential as a valuable tool for students, researchers, and anyone interested in engaging more deeply with Hindu scriptures.