Language-Identification

Description

Identify the Nigerian language in which a text is written. This machine learning project classifies a Nigerian text into its language class. Each language is modelled using character n-grams as features, and classification uses the Mutual Cross Entropy algorithm. The model is deployed as a local web application. Classification accuracy of up to 80% is recorded.
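The general idea of character n-gram profiles scored by cross-entropy can be sketched as follows. This is a minimal illustration, not the code in this repository: the function names, the trigram size, the add-one smoothing, and the use of plain cross-entropy (rather than the exact Mutual Cross Entropy formulation) are all assumptions made for the example, and the training strings are toy placeholders rather than real language data.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the list of overlapping character n-grams in text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train_profile(samples, n=3):
    """Count character n-grams over all training samples for one language."""
    counts = Counter()
    for sample in samples:
        counts.update(char_ngrams(sample, n))
    return counts


def cross_entropy(text, profile, vocab_size, n=3):
    """Average negative log-probability of the text's n-grams under a profile.

    Add-one smoothing (an assumption of this sketch) keeps unseen n-grams
    from producing log(0).
    """
    grams = char_ngrams(text, n)
    if not grams:
        return float("inf")
    total = sum(profile.values())
    return -sum(
        math.log((profile[g] + 1) / (total + vocab_size)) for g in grams
    ) / len(grams)


def classify(text, profiles, n=3):
    """Return the language whose profile gives the lowest cross-entropy."""
    vocab = set()
    for profile in profiles.values():
        vocab.update(profile)
    vocab_size = len(vocab) + 1  # one extra slot for unseen n-grams
    return min(
        profiles,
        key=lambda lang: cross_entropy(text, profiles[lang], vocab_size, n),
    )


# Toy placeholder data: "lang_a" and "lang_b" are hypothetical labels.
profiles = {
    "lang_a": train_profile(["abababab abab"]),
    "lang_b": train_profile(["xyzxyz xyzxyz"]),
}
print(classify("ababab", profiles))
```

In practice each profile would be trained on the texts for one language, and the input text would be assigned to the language with the lowest score.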

Files

The files in this repository include:

  1. CODES: Contains the Python files with the modelling (Language_Modeling) and classification (Language_Classifier) functions. It also contains the files used for training.
  2. TRAINING FILES: Contains the files used for training.
  3. TEST FILES: Contains files used for validation.
  4. WEB APPLICATION: Contains files used for the web app.
