Inspiration

Large language models trained on broad web-scale data often struggle with statutory and regulatory domains, where rules are precise, conditional, and frequently updated. While testing general-purpose models on Indian income tax questions, we observed repeated factual inaccuracies, particularly when distinguishing between the Old Tax Regime and the New Tax Regime.

A recurring failure involved Section 80C: models incorrectly claimed that its deductions were available under the New Tax Regime. This pointed to a deeper issue: the models were relying on outdated or conflicting information from their web training data rather than on current statutory rules. We therefore set out to test whether parameter-efficient fine-tuning could correct such high-confidence factual errors.


What it does

This project fine-tunes ERNIE-4.5 to answer Indian income tax regime questions accurately for Assessment Year 2024–25.

The fine-tuned model:

  • Correctly differentiates between Old and New Tax Regimes
  • Avoids hallucinating deductions, such as Section 80C, that are not available under the New Tax Regime
  • Produces concise, regime-aware, and legally consistent answers
  • Retains general language understanding while correcting domain-specific facts

How we built it

We used LoRA-based parameter-efficient fine-tuning via Unsloth to adapt ERNIE-4.5 using a small, carefully curated dataset.

Key implementation details:

  • Structured prompts in an ERNIE-style format:

    Entity → Attributes → Query → Answer

  • Answer-only supervision, where loss is applied only to answer tokens

  • Explicit Old vs New regime contrast pairs to break pretraining biases

  • LoRA adapters applied to attention and feed-forward layers for efficiency
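The prompt layout and the contrast pairs can be illustrated with a minimal sketch. The field labels ("Entity:", "Attributes:" and so on), the `TaxExample` class, and the helper names are assumptions made for illustration; the exact template used in the project is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxExample:
    entity: str        # what the question is about, e.g. a section of the Act
    attributes: dict   # facts that condition the answer, e.g. the regime
    query: str
    answer: str


def format_prompt(ex: TaxExample) -> str:
    """Render everything up to (but not including) the answer text."""
    attrs = "; ".join(f"{key}: {value}" for key, value in ex.attributes.items())
    return (
        f"Entity: {ex.entity}\n"
        f"Attributes: {attrs}\n"
        f"Query: {ex.query}\n"
        f"Answer: "
    )


def format_example(ex: TaxExample) -> str:
    """Full training text: prompt followed by the supervised answer."""
    return format_prompt(ex) + ex.answer


def contrast_pair(entity: str, query: str, old_answer: str, new_answer: str):
    """Two examples that differ only in the regime attribute and the answer."""
    ay = "2024-25"
    old = TaxExample(entity, {"Regime": "Old", "Assessment Year": ay}, query, old_answer)
    new = TaxExample(entity, {"Regime": "New", "Assessment Year": ay}, query, new_answer)
    return old, new


old_ex, new_ex = contrast_pair(
    entity="Section 80C",
    query="Can I claim this deduction?",
    old_answer="Yes. Section 80C deductions can be claimed under the Old Tax Regime.",
    new_answer="No. Section 80C deductions are not available under the New Tax Regime.",
)
print(format_example(old_ex))
print(format_example(new_ex))
```

Because the two examples share the same entity and query, the regime attribute is the only signal that separates the answers, which is the point of a contrast pair.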

This approach allowed us to override incorrect pretraining priors while training less than 1% of the model’s parameters.
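To see why the trainable share stays so small, here is a rough back-of-the-envelope sketch. For a frozen weight of shape d_out x d_in, a rank-r LoRA adapter adds r x (d_in + d_out) trainable parameters. The layer layout (four attention projections and three feed-forward projections per block), the dimensions, and the rank below are illustrative assumptions, not ERNIE-4.5's actual configuration or the settings used in this project.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds next to one frozen d_out x d_in weight:
    matrix A (rank x d_in) plus matrix B (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_fraction(hidden: int, ffn: int, rank: int) -> float:
    """Share of trainable parameters in one hypothetical transformer block."""
    # Assumed layout: q, k, v, o projections plus gate, up, down projections.
    shapes = [(hidden, hidden)] * 4 + [(hidden, ffn)] * 2 + [(ffn, hidden)]
    frozen = sum(d_in * d_out for d_in, d_out in shapes)
    added = sum(lora_params(d_in, d_out, rank) for d_in, d_out in shapes)
    return added / (frozen + added)


# Hypothetical sizes, chosen only to make the arithmetic concrete.
fraction = trainable_fraction(hidden=2048, ffn=8192, rank=8)
print(f"trainable share per block: {fraction:.4%}")
```

Embeddings and other frozen weights are left out of this estimate; including them would only lower the trainable share further.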


Challenges we ran into

  • Conflicting tax rules embedded in web-trained model priors
  • Dependency instability in Colab due to rapidly evolving ML libraries
  • Keeping the answer-only loss mask aligned with token boundaries during tokenization
  • Preventing overfitting while working with a small dataset

These challenges required careful dataset design and controlled fine-tuning rather than brute-force scaling.
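One common way to keep an answer-only mask aligned is to tokenize the prompt and the answer separately, concatenate the IDs, and set the prompt positions in the labels to -100, the value that PyTorch's cross-entropy loss ignores by default. The sketch below uses a toy whitespace tokenizer as a stand-in so that it runs on its own; the function names are hypothetical, and this is not necessarily how the project implemented its masking.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def make_toy_tokenizer():
    """Stand-in whitespace tokenizer; a real run would use the model's tokenizer."""
    vocab = {}

    def tokenize(text: str) -> list:
        # Assign each new piece the next free non-negative ID.
        return [vocab.setdefault(piece, len(vocab)) for piece in text.split()]

    return tokenize


def build_answer_only_labels(prompt: str, answer: str, tokenize):
    """Return (input_ids, labels) where only answer tokens carry a loss."""
    prompt_ids = tokenize(prompt)
    answer_ids = tokenize(answer)
    input_ids = prompt_ids + answer_ids
    # Prompt positions are masked; answer positions keep their token IDs.
    labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids
    assert len(input_ids) == len(labels), "mask must align one-to-one with tokens"
    return input_ids, labels


tokenize = make_toy_tokenizer()
prompt = "Query: Is Section 80C allowed under the New Tax Regime? Answer:"
answer = "No, it is not available."
input_ids, labels = build_answer_only_labels(prompt, answer, tokenize)
print(input_ids)
print(labels)
```

Tokenizing the two parts separately means the boundary between masked and supervised positions is known exactly, rather than being inferred from character offsets in the joined text.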


Accomplishments that we're proud of

  • Successfully corrected high-confidence factual errors in the base ERNIE model
  • Achieved clear behavioral improvement using a small, targeted dataset
  • Maintained model fluency while improving legal and regulatory accuracy
  • Demonstrated the effectiveness of LoRA for domain-specific factual correction

What we learned

  • Fine-tuning is not just about adding knowledge but also about correcting misplaced confidence
  • For targeted factual correction, a small, high-quality dataset can be more effective than a large, noisy corpus
  • Prompt structure plays a critical role in factual reliability
  • Parameter-efficient methods are well-suited for regulatory domains

What's next for ERNIE fine-tuning using Unsloth

  • Expanding coverage to additional tax sections and assessment years
  • Incorporating retrieval-based verification for statutory references
  • Adding confidence calibration for regulatory responses
  • Deploying the model as an interactive tax advisory assistant
