Inspiration
Financial reports are dense, repetitive, and hard to digest, especially SEC 10-K filings. I wanted to make reading financial data faster, smarter, and more interactive by fine-tuning a language model specifically on real financial documents.
What It Does
MaskMind: Financial Mask-Filler MLM lets users enter partial financial sentences containing [MASK] tokens, and the model predicts the missing words based on what it learned from Apple's SEC 10-K filings.
It's like "autocomplete", but for financial experts.
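As a rough illustration of the interaction, here is a minimal sketch built on the Hugging Face `fill-mask` pipeline. The `model_dir` path and both helper names are hypothetical and not taken from the project; the sketch also assumes a sentence with a single [MASK], in which case the pipeline returns a list of dictionaries with `score` and `token_str` keys.

```python
def format_predictions(predictions, top_k=3):
    """Turn fill-mask style results into short 'token (probability)' strings.

    `predictions` is assumed to be a list of dicts with "score" and
    "token_str" keys, as returned for a sentence with one [MASK].
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f'{p["token_str"].strip()} ({p["score"]:.1%})' for p in ranked[:top_k]]


def fill_mask(sentence, model_dir):
    """Run a locally stored checkpoint on one masked sentence.

    `model_dir` is a hypothetical path to the fine-tuned model folder.
    The import is kept inside the function so the formatting helper
    above can be used without transformers installed.
    """
    from transformers import pipeline

    filler = pipeline("fill-mask", model=model_dir)
    return format_predictions(filler(sentence))
```

With a checkpoint saved locally, a call such as `fill_mask("Net sales increased due to higher [MASK] sales.", "./maskmind-checkpoint")` would return the top three candidate words with their probabilities.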
How I Built It
- Data Collection: Downloaded Apple's SEC 10-K annual reports
- Preprocessing: Cleaned, tokenized, and chunked the raw text
- Model Fine-Tuning: Started from a BERT checkpoint already fine-tuned on SQuAD, then fine-tuned it further with masked language modeling (MLM) on Apple's 10-K text
- Gradio App: Built a secure, private app that users can interact with directly without needing an internet connection
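The chunking and masking steps above can be sketched in plain Python. This is an illustration under assumptions, not the project's actual pipeline: the function names, the window sizes, and the 15% masking rate with the 80/10/10 replacement split (the scheme from the original BERT recipe) are all choices made for the example. In a real Transformers setup, `DataCollatorForLanguageModeling` typically handles the masking.

```python
import random

MASK_TOKEN = "[MASK]"


def chunk_tokens(tokens, chunk_size=128, stride=64):
    """Split a token list into overlapping windows of at most chunk_size.

    chunk_size and stride are illustrative defaults, not the project's values.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """BERT-style masking for MLM training.

    Returns (inputs, labels). labels[i] holds the original token where a
    position was selected for prediction and None everywhere else.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK_TOKEN)         # 80%: replace with [MASK]
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the original token
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels
```

Overlapping windows keep sentences that straddle a chunk boundary visible in at least one chunk, which matters for long filings that exceed the model's input length.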
Challenges I Faced
- Dealing with the size and complexity of 10-K filings
- Avoiding generic predictions by fine-tuning on domain-specific financial language
- Hosting large model checkpoints outside GitHub without losing accessibility
Accomplishments I'm Proud Of
- Built a fully private, secure financial domain MLM
- Fine-tuned a masked language model end to end, starting from a pretrained BERT checkpoint
- Created a professional, clean Gradio web interface ready for real-world use
What's Next
- Expand to other companies beyond Apple
- Add multi-company or industry-specific models
- Build lightweight financial Q&A systems alongside the mask-filler
Built With
- 10-k
- 3.12
- apple
- custom
- docker-ready
- filings
- fine-tuning
- gradio
- huggingface
- python
- scripts
- sec
- transformers