Inspiration: Access to finance is a major challenge for smallholder farmers.
Traditional credit systems often rely on collateral and lengthy paperwork, excluding rural farmers who may lack these resources. We asked ourselves: What if data science could bridge the gap? What if we could predict whether a farmer is creditworthy using only basic socio-economic features?
This project was born out of the belief that AI can drive financial inclusion for underserved communities. Across Africa, millions of smallholder farmers struggle to access credit. Banks and lenders see them as high risk, mainly because there is no structured way to evaluate their creditworthiness. As a result, many farmers can’t invest in seeds, fertilizers, or equipment, and this limits food security and economic growth.
What it does: We built a machine learning pipeline that processes farmer data, trains predictive models like Logistic Regression and Random Forest, and evaluates their performance. We then deployed the best-performing model into a Streamlit app, where users can interact with an AI chatbot.
The chatbot lets lenders ask natural-language questions such as ‘Will a 40-year-old woman in rural Nigeria with secondary education qualify for a loan?’ and instantly returns a prediction with a confidence score.
How we built it: The process followed a full ML lifecycle:
Data Preprocessing & Feature Engineering: Cleaned farmer survey data. Encoded categorical variables such as education level, phone ownership, and location sector. Engineered new features (e.g., years the farmer has lived in the area, women’s support access).
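As a rough illustration of the cleaning and encoding step, here is a minimal pandas sketch. The column names, category labels, and fill rules are assumptions made for the example; the actual survey schema is not shown in this write-up.

```python
import pandas as pd

# Hypothetical survey rows; column names and categories are illustrative only.
raw = pd.DataFrame({
    "age": [40, 27, None, 55],
    "education": ["secondary", "primary", "tertiary", None],
    "owns_phone": ["yes", "no", "yes", "yes"],
    "sector": ["rural", "urban", "rural", "rural"],
})

# Assumed ordinal encoding for education level.
EDUCATION_ORDER = {"none": 0, "primary": 1, "secondary": 2, "tertiary": 3}


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values and turn categorical answers into numbers."""
    out = df.copy()
    # Missing ages get the median age; missing education is treated as "none".
    out["age"] = out["age"].fillna(out["age"].median())
    out["education"] = out["education"].fillna("none").map(EDUCATION_ORDER)
    # Yes/no and sector answers become 0/1 flags.
    out["owns_phone"] = (out["owns_phone"] == "yes").astype(int)
    out["is_rural"] = (out["sector"] == "rural").astype(int)
    return out.drop(columns=["sector"])


features = preprocess(raw)
print(features)
```

Median imputation and an ordinal education scale are only one reasonable choice; one-hot encoding would work just as well for unordered categories such as location sector.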
Model Training: Trained multiple models: Logistic Regression, Decision Tree, and Random Forest. Selected Logistic Regression for the chatbot because of its interpretability. It reached an accuracy of about 0.91 on the validation set. The model estimates P(loan approval | X) = σ(WX + b), where σ is the logistic sigmoid function.
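To make the formula concrete, the sketch below evaluates σ(WX + b) by hand for a single farmer. The weights, bias, and feature order are made-up values for illustration, not the coefficients of the trained model.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: maps any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Return sigma(W.x + b), the estimated probability of loan approval."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


# Hypothetical coefficients for [education, owns_phone, is_rural].
weights = [0.8, 0.5, -0.3]
bias = -1.0

x = [2, 1, 1]  # secondary education, owns a phone, lives in a rural area
p = predict_proba(weights, bias, x)
print(f"P(loan approval | X) = {p:.3f}")
```

Here WX + b = 0.8, so the estimated probability is about 0.69. Each weight shifts the log-odds linearly, which is what makes the model easy to explain.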
Visualization & Insights: Built Power BI dashboards for lender analytics. Visualized model performance, feature importance, and farmer distribution.
Chatbot Integration: Designed a natural language parser (regex + keyword mapping). Converted questions like "Will a 40-year-old farmer in rural area with secondary education get a loan?" into feature vectors. Returned predictions with confidence levels.
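A regex-plus-keyword parser of this kind can be sketched in a few lines. The feature names, keyword table, and patterns below are assumptions chosen for the example; the project's real parser may cover more fields and phrasings.

```python
import re

# Assumed keyword-to-code mapping for education level.
EDUCATION_KEYWORDS = {"primary": 1, "secondary": 2, "tertiary": 3}


def parse_question(question: str) -> dict:
    """Extract a small feature dictionary from a free-text question."""
    text = question.lower()
    features = {"age": None, "is_female": 0, "is_rural": 0, "education": 0}

    # Age: matches forms like "40-year-old" or "40 years old".
    age_match = re.search(r"(\d{1,3})[\s-]*years?[\s-]*old", text)
    if age_match:
        features["age"] = int(age_match.group(1))

    # Simple keyword flags.
    if re.search(r"\b(woman|female)\b", text):
        features["is_female"] = 1
    if re.search(r"\brural\b", text):
        features["is_rural"] = 1

    # Education level via keyword lookup.
    for keyword, level in EDUCATION_KEYWORDS.items():
        if re.search(rf"\b{keyword}\b", text):
            features["education"] = level

    return features


parsed = parse_question(
    "Will a 40-year-old woman in rural Nigeria with secondary education qualify for a loan?"
)
print(parsed)
```

Fields the question does not mention keep their defaults (for example, `age` stays `None`), so the caller has to decide how to fill them before passing the vector to the model.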
Deployment: Packaged the solution in Streamlit with multiple sections: Farmer Credit Profile, Lender Dashboard, Insights & Visualizations, and AI Chatbot.
Challenges we ran into: Data Quality Issues: Many survey responses had missing or inconsistent values.
Imbalanced Classes: There were more “eligible” than “ineligible” farmers, which required resampling and balanced metrics.
Explainability Gap: Explaining to non-technical stakeholders (farmers and lenders) why the model made a given prediction was tricky, so SHAP and visualizations were critical.
Deployment Tradeoffs: Some advanced models performed better but were too heavy for real-time use. We chose Logistic Regression for chatbot speed and Random Forest for dashboard predictions.
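The class-imbalance point can be illustrated with scikit-learn. The sketch below uses toy labels (not project data) to show random oversampling of the minority class and why plain accuracy is misleading; the exact resampling method used in the project is not stated, so this is one possible approach.

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.utils import resample

# Toy labels: 1 = eligible (majority), 0 = ineligible (minority).
y = np.array([1] * 8 + [0] * 2)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature column

# Oversample the minority class with replacement up to the majority size.
X_maj, y_maj = X[y == 1], y[y == 1]
X_min, y_min = X[y == 0], y[y == 0]
X_min_up, y_min_up = resample(
    X_min, y_min, replace=True, n_samples=len(y_maj), random_state=0
)
X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])

# A model that always predicts "eligible" looks good on plain accuracy
# but is no better than chance on balanced accuracy.
always_eligible = np.ones_like(y)
plain = accuracy_score(y, always_eligible)
balanced = balanced_accuracy_score(y, always_eligible)
print(f"accuracy={plain:.2f}, balanced accuracy={balanced:.2f}")
```

Resampling should be applied to the training split only, so that validation metrics still reflect the real class distribution.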
Accomplishments that we're proud of: Achieved over 91% accuracy with Logistic Regression, while keeping the model interpretable for non-technical stakeholders.
Designed a chatbot interface that allows anyone to ask natural-language questions like “Will a rural woman with tertiary education get a loan?”
Built a Power BI dashboard for lenders to visualize patterns in farmer data.
Created a solution that is scalable, transparent, and adaptable by microfinance institutions.
What we learned: The importance of feature engineering: even seemingly minor variables (like phone ownership) can significantly affect predictions.
Explainability is just as important as accuracy in financial ML applications. SHAP values and visualizations helped us communicate results clearly.
Balancing model complexity against usability: we learned when to choose simpler models (Logistic Regression) for speed and transparency and when to choose more complex ones for accuracy.
The value of teamwork in hackathons: combining skills in data science, software engineering, and visualization.
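For a linear model such as Logistic Regression, per-feature attributions in log-odds space have a simple closed form when features are treated as independent: each feature contributes its weight times its deviation from the background mean. The sketch below computes this directly with NumPy rather than calling the shap library, and the weights, background rows, and feature names are invented for illustration.

```python
import numpy as np


def linear_contributions(weights, bias, background, x):
    """Additive log-odds attributions for a linear model.

    contribution_i = w_i * (x_i - mean_i), assuming independent features.
    """
    weights = np.asarray(weights, dtype=float)
    background = np.asarray(background, dtype=float)
    x = np.asarray(x, dtype=float)
    mean = background.mean(axis=0)
    base_value = float(weights @ mean + bias)  # log-odds at the average input
    contributions = weights * (x - mean)
    return base_value, contributions


# Hypothetical model and data for [education, owns_phone, is_rural].
feature_names = ["education", "owns_phone", "is_rural"]
weights, bias = [0.8, 0.5, -0.3], -1.0
background = [[1, 0, 1], [2, 1, 1], [3, 1, 0], [2, 0, 0]]
x = [3, 1, 1]

base, contrib = linear_contributions(weights, bias, background, x)
for name, value in zip(feature_names, contrib):
    print(f"{name:>12}: {value:+.2f}")
print(f"base + sum = {base + contrib.sum():.2f} (the model's log-odds for x)")
```

The contributions always add up to the gap between the model's output for this farmer and its output for the average farmer, which gives lenders a per-feature breakdown that is easy to chart.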
What's next for Farmers’ Creditworthiness Prediction: Expand Data Sources: Integrate weather data, soil quality, and market prices to make predictions more robust.
Fairness & Bias Testing: Ensure the model does not unintentionally disadvantage women, youth, or rural farmers.
Mobile Deployment: Develop a lightweight Android app so rural farmers can access predictions offline.
Partnerships: Collaborate with microfinance institutions and NGOs to pilot the system in real communities.
Advanced NLP: Upgrade the chatbot with LLMs (e.g., LangChain + GPT models) for more natural and flexible interactions.
Built With
- decision-tree
- github
- joblib
- machine-learning
- matplotlib
- natural-language-processing
- pandas
- plotly
- logistic-regression
- power-bi
- python
- numpy
- random-forest
- regex
- scikit-learn
- seaborn
- shap
- streamlit