Inspiration
What it does
How I built it
- Download a scan of the 1912 Laird & Lee edition of Webster's dictionary
- OCR all the pages (pyOCR)
- Match up continuous lines of definitions (lots of heuristics)
- Identify illustrations
- Identify syllables and final-word rhymes (using Pronouncing, the CMU Pronouncing Dictionary, NLTK, and some guesses)
- Sort rhymes by closeness to iambic meter (stress/unstress)
- Format rhymes by popular schemes
  - Ballade: ABABBCBC -> BCBC
  - Cinquain: ABABB
  - Alternate Rhyme: ABAB
  - Limerick: AABBA
  - (some short/long rules, based on what sounds right more than on any research)
- Generate poems, with title referring to the definitions
- Format generated poems as images (PIL)
  - Dynamic title line breaks
  - Pull out square-ish images from dictionary pages, create a mask, overlay onto header
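
The meter-sorting and scheme-formatting steps above could be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the `(text, stresses)` tuple shape, and the scoring rule (count the syllables that break an unstressed/stressed alternation) are all assumptions. Stress strings use CMU-style digits, where 0 is unstressed and 1 or 2 is stressed.

```python
from collections import Counter

# Schemes listed in the write-up. Treating "BCBC" as a separate
# closing block for the ballade is an assumption.
SCHEMES = {
    "ballade": "ABABBCBC",
    "ballade_envoi": "BCBC",
    "cinquain": "ABABB",
    "alternate": "ABAB",
    "limerick": "AABBA",
}


def iambic_distance(stresses):
    """Count syllables that break an unstressed/stressed alternation.

    `stresses` is a string of CMU-style digits, one per syllable.
    Lower is closer to iambic meter; 0 is a perfect match.
    """
    misses = 0
    for i, digit in enumerate(stresses):
        expect_stressed = i % 2 == 1  # iambs go da-DUM, so odd indexes carry stress
        if (digit != "0") != expect_stressed:
            misses += 1
    return misses


def sort_by_iambic(lines):
    """Order (text, stresses) pairs from most to least iambic (stable sort)."""
    return sorted(lines, key=lambda line: iambic_distance(line[1]))


def fill_scheme(scheme, rhyme_groups):
    """Lay out lines according to a scheme such as "AABBA".

    `rhyme_groups` is a list of lists; every line in one inner list
    rhymes with the others. Returns the line texts in poem order, or
    None when the groups cannot satisfy the scheme.
    """
    needed = Counter(scheme)
    # Pair the most demanding letters with the largest rhyme groups.
    letters = sorted(needed, key=lambda letter: -needed[letter])
    groups = sorted(rhyme_groups, key=len, reverse=True)
    if len(groups) < len(letters):
        return None
    supply = {}
    for letter, group in zip(letters, groups):
        if len(group) < needed[letter]:
            return None
        supply[letter] = iter(sort_by_iambic(group))
    return [next(supply[letter])[0] for letter in scheme]


if __name__ == "__main__":
    groups = [
        [("x day", "10"), ("away", "01"), ("to stay", "01")],
        [("tonight", "01"), ("the light", "01")],
    ]
    for line in fill_scheme(SCHEMES["limerick"], groups):
        print(line)
```

Within each rhyme group the most iambic lines are used first, so a group's weaker lines only appear when the scheme needs that many rhymes.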
Challenges I ran into
Accomplishments that I'm proud of
- Identifying the actual definitions from the OCR
What I learned
What's next for Definitions
- Fix OCR, different source
- Page numbers
- Improve styling
- Share the code!