A bad rainfall prediction hurts agriculture most of all, especially farmers, whose entire crop depends on rainfall, and agriculture is an important part of every economy. Making accurate rainfall predictions is therefore valuable. A number of machine learning techniques have been applied to this problem, but accuracy remains a concern. Rainfall anomalies contribute to droughts, floods, and intense summer heat around the world, and they also affect water resources. Our main concern is a major decline in rainfall.

In our experimental study we use rainfall data collected from the official website of the Indian government. The data comprises more than a decade of rainfall measurements from all over India. As water becomes a growing issue worldwide, rainfall prediction is especially important for India. In this paper we therefore try to optimize the results and find the model best suited to rainfall prediction for Indian regions specifically.


Data preprocessing is a data mining technique that transforms raw data into an understandable format. Real-world data is often incomplete, inconsistent, or lacking in certain behaviors or trends, and is likely to contain errors. We carried out the preprocessing steps below.

We learned in our EDA step that our data set is highly imbalanced. Imbalanced data produces biased results because the model doesn’t learn much about the minority class. We performed two experiments: one with oversampled data and another with undersampled data.
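The write-up does not say which resampling method was used, so the following is only a minimal sketch of the two ideas, assuming plain random resampling. The function names and the toy data are hypothetical and are not taken from the project.

```python
import random
from collections import defaultdict


def group_by_label(rows, labels):
    """Collect the rows that belong to each class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class is as large as the largest one."""
    rng = random.Random(seed)
    groups = group_by_label(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    groups = group_by_label(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 samples of class 0 and 2 samples of class 1.
rows = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
```

In practice, resampling is usually applied to the training split only, so that the test set keeps the original class distribution.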

We're proud of the accuracy rates we've achieved:

Algorithms / Training (%) / Testing (%)
Linear Regression / 41.7 / 33.1
Lasso Regression / 26.1 / 25.6
Ridge Model / 41.7 / 33.3
SVM Model / 3.5 / 1.7
Random Forest Model / 72.7 / 42.1
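One way to read these scores is to compare training and testing values for each model, since a large gap between the two often suggests overfitting. The sketch below does that arithmetic on the reported numbers, rounded to one decimal place. The variable names are hypothetical, and the interpretation is an assumption rather than a conclusion stated by the project.

```python
# Reported (training %, testing %) scores, rounded to one decimal place.
scores = {
    "Linear Regression": (41.7, 33.1),
    "Lasso Regression": (26.1, 25.6),
    "Ridge Model": (41.7, 33.3),
    "SVM Model": (3.5, 1.7),
    "Random Forest Model": (72.7, 42.1),
}

# Gap between training and testing score for each model.
gaps = {name: round(train - test, 1) for name, (train, test) in scores.items()}

# Model with the highest testing score, and model with the largest gap.
best_test = max(scores, key=lambda name: scores[name][1])
largest_gap = max(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name}: gap = {gap} percentage points")
```

On these numbers the Random Forest model has both the highest testing score and the largest training/testing gap.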

We explored and applied several preprocessing steps and learned their impact on the overall performance of our classifiers. We also carried out a comparative study of all the classifiers with different input data and observed how the input data can affect the model predictions.
