I am studying Electrical and Computer Engineering with minors in Military Studies and Arabic. This summer, I had the incredible opportunity to study in Morocco and live in Jordan, where I experienced firsthand how challenging Arabic can be for native English speakers, especially the concept of short vowels (ḥarakāt). Arabic also includes several sounds that don't exist in English, which adds to the difficulty. To explore this challenge, I trained a machine learning model to recognize the letter ʿAyn (ع) with its short vowel (ḥaraka) variations.

Although I consider myself more of a hardware-focused person, I'm eager to strengthen my programming skills. I am passionate about both language learning and the military, and my long-term goal is to build programs that accelerate the Arabic learning process and help others overcome the same challenges I faced.

When I started this project, I had no prior experience with machine learning or audio processing, so I had to learn every part of the process from the ground up. I began by collecting my own dataset, recording the Arabic letter ʿAyn (ع) with each of its three short vowels: fatḥa, kasra, and ḍamma. To make the model more robust, I also asked friends to contribute their voices so it could learn from different tones and pronunciations. I organized these recordings into labeled folders, one for each ḥaraka.

Since machine learning models need consistent input, I wrote a preprocessing function that loads each audio file, trims the silence from the start and end, normalizes the volume, and crops or pads the clip to a fixed length. After cleaning the audio, I extracted MFCC (Mel-Frequency Cepstral Coefficients) features, which summarize the overall spectral shape of a sound and are especially useful for distinguishing vowels. Because my dataset was small, I also used data augmentation to create new, slightly modified versions of each clip by changing the pitch or speed, or by adding soft background noise.

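The preprocessing steps described above (trim silence, normalize volume, crop or pad to a fixed length) and the background-noise augmentation can be sketched roughly as follows. This is a minimal NumPy illustration, not the project's actual code: the sample rate, clip length, silence threshold, noise level, and all function names are assumptions chosen for the example, and a synthetic tone stands in for a real recording. Loading audio files and extracting MFCCs are left out here.

```python
import numpy as np

SAMPLE_RATE = 16_000       # assumed sample rate in Hz
TARGET_SECONDS = 1.0       # assumed fixed clip length
SILENCE_THRESHOLD = 0.02   # assumed amplitude below which a sample counts as silence


def trim_silence(signal, threshold=SILENCE_THRESHOLD):
    """Drop leading and trailing samples quieter than the threshold."""
    loud = np.flatnonzero(np.abs(signal) >= threshold)
    if loud.size == 0:
        return signal[:0]  # the whole clip is silence
    return signal[loud[0]:loud[-1] + 1]


def normalize_peak(signal):
    """Scale the clip so its loudest sample has magnitude 1."""
    peak = np.max(np.abs(signal)) if signal.size else 0.0
    return signal / peak if peak > 0 else signal


def fix_length(signal, length):
    """Crop long clips and zero-pad short ones to exactly `length` samples."""
    if signal.size >= length:
        return signal[:length]
    return np.pad(signal, (0, length - signal.size))


def preprocess(signal, sample_rate=SAMPLE_RATE, seconds=TARGET_SECONDS):
    """Trim, normalize, then force a fixed length, in that order."""
    target = int(sample_rate * seconds)
    return fix_length(normalize_peak(trim_silence(signal)), target)


def add_noise(signal, noise_level=0.005, seed=0):
    """Augmentation: add soft Gaussian background noise to a clip."""
    rng = np.random.default_rng(seed)
    return signal + noise_level * rng.standard_normal(signal.size)


# Synthetic stand-in for a recording: silence, a half-second tone, silence.
t = np.arange(int(0.5 * SAMPLE_RATE)) / SAMPLE_RATE
tone = 0.3 * np.sin(2 * np.pi * 220 * t)
clip = np.concatenate([np.zeros(2000), tone, np.zeros(3000)])

clean = preprocess(clip)   # fixed-length, peak-normalized clip
noisy = add_noise(clean)   # one augmented copy of the same clip
```

Every clip that passes through `preprocess` ends up with the same number of samples, which is what lets the later feature vectors share one shape.
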
I then trained a simple Logistic Regression classifier using scikit-learn to predict which ḥaraka was present in each clip. Finally, I built a graphical interface using Tkinter that allows users to select an audio file and see the predicted ḥaraka along with a confidence score.

Through this process, I learned how to collect and label data, preprocess and normalize audio, extract meaningful features, train and evaluate a machine learning model, and wrap it all in a user-friendly interface. As someone who has always considered herself more of a hardware-focused engineer, this project gave me valuable experience in programming, software design, and machine learning, and showed me that I can create practical tools that connect my interests in Arabic and technology.
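
The training step and the confidence score mentioned above can be sketched with scikit-learn's `LogisticRegression` and its `predict_proba` method. This is only an illustration under stated assumptions: the feature vectors are random synthetic data standing in for MFCC features, and the feature count, clips per class, train/test split, and the helper `predict_with_confidence` are hypothetical choices for the example rather than details of the project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

LABELS = ["fatḥa", "kasra", "ḍamma"]
N_FEATURES = 13        # assumed: one averaged value per MFCC coefficient
CLIPS_PER_CLASS = 40   # assumed dataset size, for illustration only

# Synthetic stand-in for MFCC features: one noisy cluster per haraka.
rng = np.random.default_rng(0)
centers = rng.normal(scale=3.0, size=(len(LABELS), N_FEATURES))
X = np.vstack(
    [center + rng.normal(size=(CLIPS_PER_CLASS, N_FEATURES)) for center in centers]
)
y = np.repeat(np.arange(len(LABELS)), CLIPS_PER_CLASS)

# Hold out a quarter of the clips so evaluation uses unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)


def predict_with_confidence(model, features):
    """Return the predicted haraka name and its class probability."""
    probs = model.predict_proba(features.reshape(1, -1))[0]
    best = int(np.argmax(probs))
    return LABELS[model.classes_[best]], float(probs[best])


label, confidence = predict_with_confidence(model, X_test[0])
accuracy = model.score(X_test, y_test)
print(f"Predicted {label} with confidence {confidence:.2f}")
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

The confidence here is the model's estimated probability for the winning class, so it always lies between 1/3 and 1 for three classes. A graphical front end would only need to call a function like `predict_with_confidence` on the features of the selected file and display the two returned values.
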
