🌾 Crop Recommendation System

The dataset comes from Kaggle: https://www.kaggle.com/datasets/madhuraatmarambhagat/crop-recommendation-dataset

📝 Problem Statement

Farmers often struggle to select the most suitable crop for cultivation due to changing soil and environmental conditions. This project builds a machine learning model that recommends the most appropriate crop based on key parameters like soil nutrients, temperature, humidity, pH, and rainfall.

The goal is to help farmers make data-driven decisions to improve productivity and sustainability.

📊 Dataset Description

The dataset used is Crop_recommendation.csv and contains the following columns:

Features (Inputs):

N: Nitrogen content in soil
P: Phosphorus content in soil
K: Potassium content in soil
temperature: Temperature in °C
humidity: Relative humidity in %
ph: pH value of the soil
rainfall: Rainfall in mm

Target (Output):

label: The recommended crop to grow (e.g., rice, maize, chickpea)

🚀 Project Workflow

1. Data Reading

Loaded the dataset using Pandas.
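
As a rough illustration of this step, the sketch below reads the CSV with pandas and checks that the columns listed in the dataset description are present. The helper name load_crop_data is hypothetical and is not taken from the project code.

```python
import pandas as pd

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall", "label"]


def load_crop_data(source):
    """Read the crop CSV from a path or file-like object and validate its columns."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df
```

A typical call would be load_crop_data("Crop_recommendation.csv"), assuming the file sits in the working directory.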

2. Data Evaluation

Displayed basic statistics and checked for null or missing values.
Confirmed data types and feature ranges.
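
The checks described here could look roughly like the following. This is a minimal sketch, and the summarize helper and the keys of its report are illustrative choices rather than the project's actual code.

```python
import pandas as pd


def summarize(df, target="label"):
    """Collect the basic checks: shape, missing values, dtypes, ranges, class counts."""
    return {
        "shape": df.shape,
        "missing_per_column": df.isnull().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
        # describe() covers numeric columns only; keep just the min/max rows.
        "ranges": df.describe().loc[["min", "max"]].to_dict(),
        "class_counts": df[target].value_counts().to_dict(),
    }
```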

3. Data Visualization

Used box plots to visualize each feature and identify outliers.
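
One way to draw a box plot per feature with matplotlib is sketched below. The function name and the figure layout are assumptions. The project may well have used seaborn or a different arrangement.

```python
import matplotlib.pyplot as plt
import pandas as pd

FEATURES = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]


def plot_feature_boxplots(df, features=FEATURES):
    """Draw one box plot per feature, side by side, and return the figure."""
    # squeeze=False keeps `axes` two-dimensional even for a single feature.
    fig, axes = plt.subplots(1, len(features), figsize=(3 * len(features), 4), squeeze=False)
    for ax, name in zip(axes[0], features):
        ax.boxplot(df[name].dropna())
        ax.set_title(name)
    fig.tight_layout()
    return fig
```

Points drawn beyond the whiskers are the candidates for the outlier review in step 7.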

4. Feature Importance

Used Random Forest to identify which features most influence crop recommendation.
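
A minimal sketch of this ranking, using scikit-learn's impurity-based feature_importances_ attribute. The helper name rank_features and the hyperparameters (200 trees, fixed seed) are assumptions, not values from the project.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def rank_features(X, y, random_state=0):
    """Fit a Random Forest and return feature importances, highest first."""
    model = RandomForestClassifier(n_estimators=200, random_state=random_state)
    model.fit(X, y)
    importances = pd.Series(model.feature_importances_, index=X.columns)
    return importances.sort_values(ascending=False)
```

The importances sum to 1, so each value can be read as a feature's share of the total impurity reduction across the forest.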

5. Label Encoding

Converted the crop labels (strings) into numeric format using LabelEncoder.
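
For illustration, this is how LabelEncoder behaves on a few crop names. The four labels below are a toy sample, not rows from the dataset.

```python
from sklearn.preprocessing import LabelEncoder

labels = ["rice", "maize", "chickpea", "rice"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)       # classes are sorted alphabetically
decoded = encoder.inverse_transform(encoded)  # maps the integers back to crop names

print(list(encoder.classes_))  # ['chickpea', 'maize', 'rice']
print(list(encoded))           # [2, 1, 0, 2]
```

Keeping the fitted encoder around matters: it is what turns a numeric prediction back into a crop name.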

6. Feature and Target Separation

Split the dataset into:
- X → input features: [N, P, K, temperature, humidity, ph, rainfall]
- y → target crop label

7. Outlier Removal

Removed extreme values based on box plot analysis to improve model robustness.
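
The write-up does not state the exact rule used. A common choice that matches what box plot whiskers show is the 1.5 × IQR fence, sketched below under that assumption. The helper name and the default k=1.5 are illustrative.

```python
import pandas as pd


def remove_iqr_outliers(df, columns, k=1.5):
    """Keep only rows where every listed column lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]
```

Because different crops can have very different typical ranges, fences computed over the whole dataset may drop legitimate rows for some crops, so it is worth checking the class counts afterwards.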

8. Apply SMOTE

Balanced the dataset using SMOTE to generate synthetic examples of minority crop classes.

9. Preprocessing

Applied standard scaling so that each feature has zero mean and unit variance. This mainly benefits SVM, which is sensitive to feature scale; Random Forest is largely unaffected by it.
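
As a small illustration, StandardScaler applied to a toy matrix. The numbers are made up and the two columns stand in for rainfall and pH.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix (made-up values): rainfall in mm and soil pH.
X = np.array([[200.0, 6.5], [60.0, 7.0], [110.0, 5.5], [250.0, 6.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and unit variance

print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

The same fitted scaler has to be reused at prediction time, so it should be saved alongside the model or bundled with it in a pipeline.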

10. Model Training

Trained the following models:
- SVM (Support Vector Machine)
- Random Forest Classifier

11. Cross-Validation

Performed 5-fold cross-validation for both models:
- SVM Average Accuracy: 97.3%
- Random Forest Average Accuracy: 99.4% ✅ (Best Model)
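
The training and 5-fold comparison can be sketched as follows. This runs on synthetic data, so its scores will not match the figures above. Wrapping the scaler and SVM in a pipeline is a choice made for this sketch (it refits the scaler inside each fold) and is not necessarily how the project did it. Hyperparameters are scikit-learn defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 7 crop features.
X, y = make_classification(
    n_samples=300, n_features=7, n_informative=5, n_classes=3, random_state=0
)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# One array of 5 accuracy scores per model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy")
    for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```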

12. Model Saving

Saved the best-performing model (Random Forest) using pickle as crop_model.pkl.
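
Saving and reloading with pickle might look like this. The helper names are hypothetical; only the file name crop_model.pkl comes from the write-up.

```python
import pickle
from pathlib import Path


def save_model(model, path="crop_model.pkl"):
    """Serialize a fitted model (or any picklable object) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return Path(path)


def load_model(path="crop_model.pkl"):
    """Load a previously pickled model. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Pickled scikit-learn models are best loaded with the same library version that saved them.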

🌱 Solution

Crop Recommendation System

The solution is a Streamlit app that helps farmers and other users find the best crop to grow based on soil nutrients and weather conditions. A machine learning model predicts the most suitable crop from seven inputs: nitrogen, phosphorus, and potassium levels, temperature, humidity, soil pH, and rainfall.

The app has three main pages:

Home: Introduces the system and explains what features are used for crop prediction.
Predict Crop: Allows users to input soil and climate data to get a recommended crop.
Data Info: Shows an overview of the dataset used to train the model, including sample data, statistics, and crop distribution.
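
The prediction logic behind the Predict Crop page could be sketched as below. The recommend_crop helper is hypothetical. It assumes the loaded model accepts raw feature values (for example, a pipeline that includes the scaler) and that a fitted LabelEncoder is passed in when the model outputs numeric classes.

```python
import pandas as pd

# Feature order must match the order used during training.
FEATURES = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]


def recommend_crop(model, values, encoder=None):
    """Predict a crop from a dict of the seven inputs.

    `values` maps each feature name to a number; `encoder` is an optional
    fitted LabelEncoder used to turn a numeric class back into a crop name.
    """
    row = pd.DataFrame([[values[name] for name in FEATURES]], columns=FEATURES)
    prediction = model.predict(row)[0]
    if encoder is not None:
        prediction = encoder.inverse_transform([prediction])[0]
    return prediction
```

In the app, the values dict would be filled from the page's input widgets and the result shown to the user.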

Built With

googlecolab
imblearn
kaggle
matplotlib
numpy
pandas
pickle
python
randomforestclassifier
scikit-learn
seaborn
streamlit
svm

Updates

Keerthana Kothapelly started this project — May 30, 2025 11:07 AM EDT
