WordFam

Inspiration

As a non-native English speaker from Vietnam, I grew up in specialized English classes where my most dreaded exercise was always 'word forms.' I hated guessing the relationship between words like 'happy,' 'happiness,' and 'unhappy.' I always dreamed of a tool that could just show me all the connections in a visual, intuitive way. WordFam is that dream—a tool to make language learning visual and accessible for students like me, built for my very first hackathon.

What it does

WordFam is a web application that transforms boring word lists into a beautiful, interactive graph. It’s a "family tree" for words.

Visualizes Connections: Instead of a list, you see a graph of morphological derivations (like 'create' → 'creation'), synonyms, and even complex compound words ('run' → 'runway').

Highly Interactive: You can drag nodes around to rearrange the graph and explore relationships.

Educational: Click any word node to instantly see its definition and part of speech in a tooltip, which makes lookups fast for students.

How we built it

WordFam is built on a multi-source data pipeline to ensure high accuracy, with a FastAPI backend and a React frontend.

Backend (FastAPI): We combine five different NLP sources: WordNet (for reliable derivations), the Datamuse API (for associations and validation), a custom morphology engine (for rule-based prefixes/suffixes), semantic embeddings (for conceptual similarity), and a custom compound word database.
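To make the rule-based source concrete, here is a minimal sketch of prefix/suffix candidate generation. The rule lists and function names are hypothetical and are not taken from WordFam's code. The point is that such an engine over-generates, so its output has to be validated afterwards.

```python
from typing import List

# Hypothetical rule tables; the real engine's rules are not described here.
PREFIXES = ["un", "re", "dis"]
SUFFIXES = ["ness", "ly", "ful", "er"]


def attach_suffix(base: str, suffix: str) -> str:
    """Join base and suffix, applying the common y -> i spelling change."""
    if len(base) > 2 and base.endswith("y") and base[-2] not in "aeiou":
        return base[:-1] + "i" + suffix
    return base + suffix


def generate_candidates(base: str) -> List[str]:
    """Return every prefix/suffix combination, whether or not it is a real word."""
    candidates = [prefix + base for prefix in PREFIXES]
    candidates += [attach_suffix(base, suffix) for suffix in SUFFIXES]
    return candidates
```

Calling generate_candidates("happy") yields real words such as 'unhappy' and 'happiness' alongside non-words such as 'rehappy' and 'happiful'.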

Core Innovation: The "brain" of our app is an etymology-based filtering system. It prevents false connections (like 'happy' and 'felicity', which have different roots) and ensures every word in the graph is checked against a dictionary before it is shown.
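One way such a filter could work is a root comparison, sketched below. The ROOTS table is a hand-written stand-in for illustration only. A real system would need an etymological data source, which this write-up does not name, and the function names are assumptions too.

```python
from typing import Dict, List

# Hypothetical root table, hand-written for this example.
ROOTS: Dict[str, str] = {
    "happy": "hap",
    "happiness": "hap",
    "unhappy": "hap",
    "felicity": "felix",
}


def same_root(base: str, candidate: str, roots: Dict[str, str] = ROOTS) -> bool:
    """True only when both words have a known root and the roots match."""
    base_root = roots.get(base)
    return base_root is not None and base_root == roots.get(candidate)


def filter_by_root(base: str, candidates: List[str]) -> List[str]:
    """Keep candidates that share the base word's root; unknown words are dropped."""
    return [word for word in candidates if same_root(base, word)]
```

With this table, filter_by_root("happy", ["happiness", "felicity", "unhappy"]) keeps 'happiness' and 'unhappy' and drops 'felicity'. Dropping words with an unknown root is a conservative choice made in this sketch. It trades coverage for fewer false links.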

Performance: We use asyncio in Python to run all these API calls and validations in parallel, cutting response times from over 15 seconds to under 5.

Frontend (React): The frontend is built in React and uses Cytoscape.js to render a fast, interactive, force-directed graph.
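Cytoscape.js accepts a flat list of elements in which each node carries data.id and each edge carries data.source and data.target. The sketch below shows how a backend might shape relation triples into that format. The function name and the extra label and relation fields are assumptions, not WordFam's actual payload.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source word, target word, relation label)


def to_cytoscape_elements(edges: List[Edge]) -> List[Dict[str, dict]]:
    """Build a flat list of node and edge elements from relation triples."""
    elements: List[Dict[str, dict]] = []
    seen = set()
    # Emit each word once as a node, in first-seen order.
    for source, target, _ in edges:
        for word in (source, target):
            if word not in seen:
                seen.add(word)
                elements.append({"data": {"id": word, "label": word}})
    # Emit one edge element per triple; the relation is part of the id
    # so two different relations between the same pair do not collide.
    for source, target, relation in edges:
        elements.append({
            "data": {
                "id": f"{source}-{relation}-{target}",
                "source": source,
                "target": target,
                "relation": relation,
            }
        })
    return elements
```

For example, to_cytoscape_elements([("create", "creation", "derivation"), ("run", "runway", "compound")]) produces four node elements followed by two edge elements, ready to be serialized as JSON.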

Challenges we ran into

Our two biggest challenges were data quality and performance.

False Connections: Initially, our graph was "polluted." Our morphology engine generated "fake" words, and WordNet sometimes linked words with different origins (like 'happy' and 'felicity'). We solved this by building a double-validation system: 1) our custom etymology filter to check word roots, and 2) a Datamuse validator to ensure every single word exists in a dictionary.
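The double-validation idea can be sketched as an ordered list of checks in which a word must pass all of them. Everything below is a stand-in. A substring test replaces the etymology filter, a small in-memory set replaces the Datamuse lookup, and all names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Check = Tuple[str, Callable[[str], bool]]  # (check name, predicate)


def validate(candidates: Iterable[str], checks: List[Check]) -> Tuple[List[str], Dict[str, str]]:
    """Run the checks in order and record the first failing check per rejected word."""
    accepted: List[str] = []
    rejected: Dict[str, str] = {}
    for word in candidates:
        for name, passes in checks:
            if not passes(word):
                rejected[word] = name
                break
        else:
            # No check failed, so the word survives both stages.
            accepted.append(word)
    return accepted, rejected


# Stand-in data for the base word 'happy'.
KNOWN_WORDS = {"happy", "happiness", "unhappy", "felicity"}

CHECKS: List[Check] = [
    ("etymology", lambda word: "happ" in word),       # crude root stand-in
    ("dictionary", lambda word: word in KNOWN_WORDS),  # dictionary stand-in
]
```

Running validate(["happiness", "felicity", "rehappy", "unhappy"], CHECKS) accepts 'happiness' and 'unhappy', rejects 'felicity' at the etymology stage, and rejects 'rehappy' at the dictionary stage. Recording the failing stage makes it easier to see which source is polluting the graph.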

Performance: Our 5-source pipeline meant 100+ sequential API calls per search, which took 15-20 seconds. This was unusable. We overcame this by re-architecting the entire backend to use asyncio and httpx, running all validations in parallel. This cut our response time to under 5 seconds.
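The pattern behind this speed-up is running I/O-bound checks concurrently with asyncio.gather. The sketch below simulates each network call with asyncio.sleep so it runs without any external service. The function names, the stand-in word set, and the concurrency cap are all assumptions. In a real backend the simulated call would be an HTTP request made with an async client such as httpx.

```python
import asyncio
from typing import Dict, List

KNOWN_WORDS = {"create", "creation", "creative", "creator"}  # stand-in dictionary


async def word_exists(word: str, delay: float = 0.05) -> bool:
    """Simulate one I/O-bound validation call with a fixed delay."""
    await asyncio.sleep(delay)
    return word in KNOWN_WORDS


async def validate_all(words: List[str], max_concurrent: int = 20) -> Dict[str, bool]:
    """Validate every word concurrently, with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(word: str) -> bool:
        async with semaphore:
            return await word_exists(word)

    # gather preserves input order, so results line up with words.
    results = await asyncio.gather(*(guarded(word) for word in words))
    return dict(zip(words, results))
```

A call such as asyncio.run(validate_all(["create", "creation", "createness"])) waits roughly one delay in total instead of one delay per word. The semaphore is an optional courtesy that avoids flooding a public API with 100+ simultaneous requests.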

Accomplishments that we're proud of

We're incredibly proud of building a tool that is not just functional but smart.

Our etymology-based filter, which we believe is unique, is a real innovation that protects data quality.

We successfully integrated five different NLP sources into a single, cohesive, and validated graph.

We cut our response time from 15-20 seconds to under 5 seconds, a reduction of at least two-thirds, by running validations in parallel.

Most of all, as a first-time hackathon participant, I'm proud of building the exact tool I dreamed of having as a student.

What we learned

We learned that in data science, data is messy. Relying on a single source (like just WordNet) is not enough, as it can link words with different etymologies. You must validate, cross-reference, and filter. We learned the hard way that 'data quality' isn't just a buzzword; it's the most important feature, which is why we built our double-validation system. We also learned the immense power of asyncio for I/O-bound tasks: it made our 5-source pipeline usable by cutting its 15-20 second load time to under 5 seconds.

What's next for WordFam

WordFam is just getting started. We have a clear roadmap:

More Languages: The pipeline is designed to be expandable. We want to add support for Spanish, French, and German next.

Historical Etymology: Right now, we use etymology to filter. We want to show it. Imagine clicking a node and seeing its full linguistic history.

User Accounts: We plan to add user accounts so students can save graphs, create custom word lists, and track their learning.

Browser Extension: We want to create a browser extension so you can highlight any word on any website and instantly generate a WordFam graph for it.

Built With

css3
cytoscape.js
datamuse
dictionary
fastapi
free
free-dictionary
git/github
javascript/jsx
nltk
pydantic
react
sentence
transformers
uvicorn
vite
wordnet

Updates

Trần Huỳnh Hạ Lam started this project — Nov 16, 2025 12:13 PM EST
