RhymeCheck: A Neural Network Built from Scratch

Inspiration

RhymeCheck was inspired by the desire to truly understand how neural networks work internally rather than treating them as black-box systems. By removing all abstractions provided by modern ML libraries, this project focuses on learning neural network fundamentals through hands-on implementation from scratch.

Project Description

RhymeCheck is a neural network implemented entirely from scratch in Python to classify whether two English words rhyme. The project is intentionally designed without the use of machine learning, deep learning, or natural language processing libraries. Instead, all learning behavior is constructed manually using core programming fundamentals, making the internal workings of the model fully transparent and interpretable.

System Working and Overview

The system operates by first transforming each input word into a fixed-length numerical representation. Since rhyming is primarily influenced by word endings, only the last four characters of each word are encoded. These encodings are concatenated to form an eight-dimensional input vector, which serves as the input to the neural network.

This input vector is processed by a fully connected feedforward neural network consisting of one hidden layer and one output layer. Each neuron computes a weighted sum of its inputs, adds a bias term, and applies a sigmoid activation function. The final output represents the model’s confidence that the two words rhyme.

Components Built From Scratch

All major neural network components are implemented manually, including:

Character-level feature encoding
Weight and bias initialization
Forward propagation through network layers
Loss computation using Mean Squared Error
Backpropagation for gradient calculation
Weight and bias updates using gradient descent

The implementation relies solely on Python’s standard library, using only loops, lists, conditionals, and basic mathematical operations.

Core Logic and Learning Mechanism

The core learning mechanism enables the network to adjust its weights and biases based on prediction errors observed during training. Through repeated exposure to labeled word pairs, the hidden layer learns internal representations of phonetic patterns that are indicative of rhyming behavior. This allows the model to generalize beyond memorized examples and make informed predictions on unseen word pairs.

Engineering Challenges and Design Decisions

Key challenges addressed in this project include designing an effective numerical encoding scheme for textual input, implementing backpropagation correctly without relying on frameworks, managing non-linear activations, and ensuring stable learning behavior using simple gradient-based optimization. Design decisions were guided by clarity, interpretability, and strict adherence to first-principles implementation.

Model Improvement

While RhymeCheck successfully demonstrates a neural network built from scratch, several improvements can further enhance its accuracy, robustness, and scalability:

Enhanced Encoding Strategy

Replacing simple character-level encoding with phonetic-aware representations, such as grouping consonants by sound or approximating syllable endings, to better capture true rhyming patterns.

Expanded and Balanced Dataset

Increasing the number and diversity of training examples to reduce bias and false positives. A more balanced dataset would improve generalization across different word structures.

Hyperparameter Tuning

Experimenting with learning rate, number of hidden neurons, and training epochs to improve convergence stability and prediction confidence.

Alternative Loss Functions

Replacing Mean Squared Error with Binary Cross-Entropy, which is better suited for binary classification tasks and can lead to more stable learning.

Deeper Network Architecture

Introducing additional hidden layers or neurons to allow the model to learn more complex non-linear relationships between character patterns.

Regularization Techniques

Applying techniques such as weight decay to reduce overfitting and improve performance on unseen inputs.

Explainability Enhancements

Adding step-by-step logging or visualization of neuron activations and weight updates to further improve transparency and educational value.

Conclusion

RhymeCheck demonstrates a complete end-to-end neural network system built from first principles. While the model is intentionally simple, it successfully captures the fundamental mechanics of neural computation and learning.

What I learned

This project provided hands-on understanding of how neural networks learn through weight and bias updates, how activation functions introduce non-linearity, and how small design decisions significantly impact model behavior. It also reinforced the importance of interpretability and fundamental knowledge over reliance on high-level frameworks.

What’s next for RhymeCheck: A Neural Network Built from Scratch

Future improvements include expanding the dataset, improving the encoding scheme to better capture phonetic features, experimenting with alternative loss functions, and extending the architecture to handle sentence-level rhyming or multi-language inputs.

Built With

python

Updates

Jyothsna Lakshminarayanan started this project — Dec 21, 2025 12:02 PM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.