## What it does
Data preprocessing plays a major role in building an accurate model, and many people, especially those new to ML, spend a lot of time preprocessing their datasets. DataCLI aims to reduce the time and effort spent in the data preprocessing phase of an ML project. It exposes CLI commands, built with Click, that apply preprocessing methods such as label encoding and simple imputation to the dataset and save the final dataframe to a CSV file that can be used to train the model.
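The flow described above can be sketched roughly as follows. This is a minimal illustration and not DataCLI's actual code: the names `preprocess`, `main`, `src`, and `dst` are hypothetical, the use of pandas for the dataframe is an assumption, and the sketch assumes every column has at least one non-missing value.

```python
import click
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: impute missing values, then label-encode text columns."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    other_cols = out.columns.difference(numeric_cols)

    # Simple imputation: fill numeric gaps with the column mean.
    if len(numeric_cols) > 0:
        imputer = SimpleImputer(strategy="mean")
        out[numeric_cols] = imputer.fit_transform(out[numeric_cols])

    # Fill gaps in the remaining columns with the most frequent value,
    # then label-encode each column into integer codes.
    for col in other_cols:
        filled = out[col].fillna(out[col].mode().iloc[0])
        out[col] = LabelEncoder().fit_transform(filled.astype(str))
    return out


@click.command()
@click.argument("src", type=click.Path(exists=True, dir_okay=False))
@click.argument("dst", type=click.Path(dir_okay=False))
def main(src, dst):
    """Read the CSV at SRC, preprocess it, and write the result to DST."""
    preprocess(pd.read_csv(src)).to_csv(dst, index=False)
    click.echo(f"Saved preprocessed data to {dst}")
```

Calling `main()` from a script entry point would expose this as a command that takes an input and an output path. Keeping the preprocessing logic in a plain function, separate from the Click command, makes it easy to test without going through the command line.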
## Tech stack used
- Python
- NumPy
- SciPy
- Scikit-learn
## Challenges we ran into
I was having a hard time implementing the Figlet function but was able to do it successfully in the end.
## Accomplishments that we're proud of
With DataCLI, preprocessing a dataset becomes a piece of cake.
## What we learned
How to integrate Python code with a command-line interface.
## What's next for DataCLI
- Preprocessing of images
- Preprocessing for NLP datasets