Inspiration
As a non-native English speaker from Vietnam, I grew up in specialized English classes where my most dreaded exercise was always 'word forms.' I hated guessing the relationship between words like 'happy,' 'happiness,' and 'unhappy.' I always dreamed of a tool that could just show me all the connections in a visual, intuitive way. WordFam is that dream: a tool that makes language learning visual and accessible for students like me, built at my very first hackathon.
What it does
WordFam is a web application that transforms boring word lists into a beautiful, interactive graph. It’s a "family tree" for words.
Visualizes Connections: Instead of a list, you see a graph of morphological derivations (like 'create' → 'creation'), synonyms, and even compound words ('run' → 'runway').
Highly Interactive: You can drag nodes around to rearrange the graph and explore relationships.
Educational: You can click any word node to instantly see its definition and part of speech in a tooltip, so students can learn a word without leaving the graph.
How we built it
WordFam is built on a multi-source data pipeline to ensure high accuracy, with a FastAPI backend and a React frontend.
Backend (FastAPI): We combine five different NLP sources: WordNet (for reliable derivations), the Datamuse API (for associations and validation), a custom morphology engine (for rule-based prefixes/suffixes), semantic embeddings (for conceptual similarity), and a custom compound word database.
Core Innovation: The "brain" of our app is an etymology-based filtering system. This is crucial because it prevents false connections (like 'happy' and 'felicity', which have different roots). Every word in the graph is also validated for dictionary existence.
Performance: We use asyncio in Python to run all these API calls and validations concurrently, cutting response times from over 15 seconds to under 5 seconds.
Frontend (React): The frontend is built in React, using Cytoscape.js to render a fast, interactive, force-directed graph.
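To give a sense of how a backend can hand a word family to Cytoscape.js, here is a minimal Python sketch. It is not WordFam's actual code: the `Relation` class, the `kind` field, and the `to_cytoscape_elements` helper are hypothetical names chosen for illustration. The only thing taken from Cytoscape.js itself is the elements format, where each node carries a `data.id` and each edge carries `data.source` and `data.target`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One typed link between two words (hypothetical shape)."""
    source: str
    target: str
    kind: str  # e.g. "derivation", "synonym", "compound"


def to_cytoscape_elements(relations):
    """Convert relations into a nodes/edges dict in Cytoscape.js elements format."""
    nodes = {}
    edges = []
    for rel in relations:
        for word in (rel.source, rel.target):
            # One node per distinct word; the dict keeps first-seen order.
            nodes.setdefault(word, {"data": {"id": word, "label": word}})
        edges.append({
            "data": {
                "id": f"{rel.source}->{rel.target}:{rel.kind}",
                "source": rel.source,
                "target": rel.target,
                "kind": rel.kind,  # lets the frontend style edges by type
            }
        })
    return {"nodes": list(nodes.values()), "edges": edges}


if __name__ == "__main__":
    family = [
        Relation("create", "creation", "derivation"),
        Relation("create", "creative", "derivation"),
        Relation("run", "runway", "compound"),
    ]
    print(to_cytoscape_elements(family))
```

A dict like this can be returned as JSON from an API endpoint and passed to Cytoscape.js as its `elements` option. Keeping the relation type on each edge is one way to let the frontend color derivations, synonyms, and compounds differently.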
Challenges we ran into
Our two biggest challenges were data quality and performance.
False Connections: Initially, our graph was "polluted." Our morphology engine generated "fake" words, and WordNet sometimes linked words with different origins (like 'happy' and 'felicity'). We solved this by building a double-validation system: 1) our custom etymology filter to check word roots, and 2) a Datamuse validator to ensure every single word exists in a dictionary.
Performance: Our 5-source pipeline meant 100+ sequential API calls per search, which took 15-20 seconds. This was unusable. We overcame this by re-architecting the entire backend to use asyncio and httpx, running all validations in parallel. This cut our response time to under 5 seconds.
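The move from sequential to concurrent validation can be sketched roughly as follows. This is an illustration under stated assumptions, not our production code: `word_exists` is a hypothetical stand-in that only sleeps to simulate network latency (a real version would make an HTTP request, for example with httpx), and `KNOWN_WORDS`, `validate_all`, and `max_concurrency` are names invented for this example.

```python
import asyncio
import time

# Hypothetical stand-in for a real dictionary service.
KNOWN_WORDS = {"happy", "happiness", "unhappy", "happily"}


async def word_exists(word: str, delay: float = 0.05) -> bool:
    """Pretend to look a word up; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return word in KNOWN_WORDS


async def validate_all(words, max_concurrency: int = 20):
    """Check all candidates concurrently, with a cap on in-flight lookups."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def check(word):
        async with semaphore:
            return word, await word_exists(word)

    # gather() returns results in the same order as the inputs.
    results = await asyncio.gather(*(check(w) for w in words))
    return [word for word, ok in results if ok]


if __name__ == "__main__":
    candidates = ["happy", "happiness", "unhappy", "happily", "happyness", "happyer"]
    start = time.perf_counter()
    valid = asyncio.run(validate_all(candidates))
    elapsed = time.perf_counter() - start
    print(valid, f"{elapsed:.2f}s")
```

Because the lookups are I/O-bound, the total time is close to the slowest single lookup rather than the sum of all of them. The semaphore is an optional design choice that keeps a large batch from flooding an external API with requests.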
Accomplishments that we're proud of
We're incredibly proud of building a tool that is not just functional but smart.
Our etymology-based filter is a real innovation that ensures data quality, which we believe is unique.
We successfully integrated five different NLP sources into a single, cohesive, and validated graph.
We cut our response time by at least two-thirds (from 15-20 seconds to under 5 seconds) by running validations concurrently.
Most of all, as a first-time hackathon participant, I'm proud of building the exact tool I dreamed of having as a student.
What we learned
We learned that in data science, data is messy. Relying on a single source (like just WordNet) is not enough, as it can link words with different etymologies. You must validate, cross-reference, and filter. We learned the hard way that 'data quality' isn't just a buzzword; it's the most important feature, which is why we built our double-validation system. We also learned the immense power of asyncio for I/O-bound tasks: it made our 5-source pipeline usable by cutting a load time of up to 20 seconds to under 5 seconds.
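The idea behind double validation can be shown with a small Python sketch. Everything here is assumed for illustration: the `ROOTS` table and `DICTIONARY` set are tiny hand-written stand-ins (a real system would consult an etymology source and an external dictionary check), and `shares_root` and `filter_candidates` are hypothetical names, not our actual functions.

```python
# Hypothetical root table; hand-written for this example only.
ROOTS = {
    "happy": "hap",
    "happiness": "hap",
    "unhappy": "hap",
    "happyness": "hap",   # what a naive "-ness" suffix rule might generate
    "felicity": "felix",  # similar meaning, different origin
}

# Hypothetical dictionary; stands in for an external existence check.
DICTIONARY = {"happy", "happiness", "unhappy", "felicity"}


def shares_root(seed, candidate, roots=ROOTS):
    """Check 1: the candidate must have the same known root as the seed."""
    seed_root = roots.get(seed)
    return seed_root is not None and roots.get(candidate) == seed_root


def filter_candidates(seed, candidates, roots=ROOTS, dictionary=DICTIONARY):
    """Keep only candidates that pass both the root and dictionary checks."""
    kept = []
    for word in candidates:
        if not shares_root(seed, word, roots):
            continue  # different or unknown origin
        if word not in dictionary:
            continue  # check 2: not a real word
        kept.append(word)
    return kept


if __name__ == "__main__":
    print(filter_candidates("happy", ["happiness", "unhappy", "happyness", "felicity"]))
    # → ['happiness', 'unhappy']
```

Neither check is enough on its own in this sketch: 'felicity' is a real word but fails the root check, while 'happyness' shares the root but fails the dictionary check.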
What's next for WordFam
WordFam is just getting started. We have a clear roadmap:
More Languages: The pipeline is designed to be expandable. We want to add support for Spanish, French, and German next.
Historical Etymology: Right now, we use etymology to filter. We want to show it. Imagine clicking a node and seeing its full linguistic history.
User Accounts: We plan to add user accounts so students can save graphs, create custom word lists, and track their learning.
Browser Extension: We want to create a browser extension so you can highlight any word on any website and instantly generate a WordFam graph for it.