Increasing crop yield based on soil type using machine learning algorithms

Project period

01/04/2020 - 02/02/2020

The growth of a particular food crop in a given soil depends on the quality of that soil and the nutrients it contains. This project uses machine learning tools to predict which plants to grow in a soil from its type and nutrient content. Farming soil may be rich in elements such as sulphur, calcium, nitrogen, phosphorus, potassium or magnesium, and the nutrient profile determines which plant or crop suits that soil. The nutrient contents of the soil are analysed with data science and machine learning algorithms.

Machine learning tools are a boon for predicting which crop to grow in a particular soil. Knowing the nutrients in a soil, and classifying the soil accordingly, is essential because matching the crop to the soil leads to a higher yield. Analysing soil type and soil nutrients with machine learning helps predict the right crop for the right soil.

Crop yields are critically dependent on soil type. A growing empirical literature models this relationship in order to project the impact of soil nutrients on the sector. We describe an approach to yield modeling that uses a semiparametric variant of a deep neural network, which can simultaneously account for complex nonlinear relationships in high-dimensional datasets, known parametric structure, and unobserved cross-sectional heterogeneity.

Soil type affects the agricultural sector more directly than many others, because the sector depends directly on soil and weather. The nature and magnitude of these impacts depend both on the evolution of the climate system and on the relationship between crop yields and soil type. This project focuses on predicting yield from soil type. Accurate models that map soil type to crop yields are important not only for projecting impacts on agriculture, but also for projecting the impact of climate change on linked economic and environmental outcomes, and in turn for mitigation and adaptation policy.

In parallel, machine learning (ML) techniques have advanced considerably over the past several decades. ML is philosophically distinct from much of classical statistics, largely because its goals are different: it focuses on predicting outcomes rather than on inference into the mechanistic processes that generate those outcomes. (We focus on supervised ML, which is used for prediction, rather than unsupervised ML, which is used to discover structure in unlabeled data.)

Why: Problem statement

Food crops often grow poorly because they are planted in soil of the wrong type or with the wrong nutrients. Poor soil quality in turn leads to poor crop yield.

How: Solution description

We used machine learning algorithms to solve the above problem. 

DATA COLLECTION:

Data was collected from various websites. The features are plant nutrients such as NITROGEN (N), PHOSPHOROUS (P), POTASSIUM (K), CALCIUM (Ca), SULFUR (S), MAGNESIUM (Mg), etc., and soil types such as ALLUVIAL SOIL, BLACK SOIL, RED AND YELLOW SOIL, LATERITE SOIL, ARID SOIL, FOREST AND MOUNTAIN SOIL, DESERT SOIL, etc. The target is the plant type.
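As a rough illustration of how such features can be fed to a classifier, the sketch below turns a selection of nutrients and a soil type into a 0/1 vector, which is how the project code appears to build its input. The shortened FEATURES list and the encode helper are assumptions made for this example, not part of the project.

```python
# Hypothetical, shortened feature list; the project's full list is longer.
FEATURES = ["NITROGEN(N)", "PHOSPHOROUS(P)", "POTASSIUM(K)",
            "ALLUVIAL SOIL", "BLACK SOIL"]


def encode(selected):
    """Return a 0/1 vector with a 1 for every feature named in `selected`."""
    chosen = set(selected)
    return [1 if name in chosen else 0 for name in FEATURES]


row = encode(["NITROGEN(N)", "POTASSIUM(K)", "BLACK SOIL"])
print(row)  # [1, 0, 1, 0, 1]
```

Each position in the vector corresponds to one column of the training data, so the order of FEATURES has to match the order of the columns used for training.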

 

Steps in the algorithms:

  • Decision tree

The decision tree algorithm belongs to the family of supervised learning algorithms and, like several others in that family, can be used for both regression and classification problems. The goal is to build a model that predicts the class or value of the target variable by learning simple decision rules inferred from prior data (the training data). To predict a class label for a record, we start at the root of the tree and compare the record’s value for the root attribute against the node’s test. Based on the comparison, we follow the matching branch to the next node and repeat until we reach a leaf, which holds the prediction.
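The root-to-leaf walk described above can be sketched in a few lines. The tree here is hand-built and purely hypothetical (the crop and soil pairings are invented for illustration, not agronomic advice); a trained tree would learn its rules from the training data.

```python
# Hypothetical hand-built tree: internal nodes test one 0/1 feature,
# leaves hold a plant name.
TREE = {
    "feature": "BLACK SOIL",
    1: {"feature": "NITROGEN(N)", 1: "TOMATO", 0: "ONION"},
    0: "POTATO",
}


def predict(node, record):
    """Walk from the root, following the branch that matches the record."""
    while isinstance(node, dict):
        # A feature missing from the record is treated as absent (0).
        node = node[record.get(node["feature"], 0)]
    return node


record = {"BLACK SOIL": 1, "NITROGEN(N)": 0}
print(predict(TREE, record))  # ONION
```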

  • Random forest

Random forest is a supervised learning algorithm that can be used for both classification and regression, although it is mainly used for classification. Just as a forest is made up of trees, and more trees make a more robust forest, the random forest algorithm builds many decision trees on random samples of the data, collects a prediction from each, and selects the final answer by voting. It is an ensemble method that usually generalises better than a single decision tree, because combining many trees reduces over-fitting.
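The two ingredients mentioned above, random data samples and voting, can be sketched in plain Python. The helper names and the vote list are hypothetical, and a real random forest (for example scikit-learn's RandomForestClassifier, which the project code uses) also randomises the features considered at each split.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as one tree's training set."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the label predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
rows = list(range(10))          # stand-in for ten training rows
sample = bootstrap_sample(rows, rng)

# Hypothetical predictions from five trees for one soil sample.
votes = ["TOMATO", "ONION", "TOMATO", "TOMATO", "POTATO"]
print(majority_vote(votes))  # TOMATO
```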

  • Naive Bayes

Naive Bayes is a classification technique based on Bayes’ theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. The model is easy to build and scales well to very large data sets. Despite its simplicity, it can perform competitively with more sophisticated classification methods on some problems.
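To make the independence assumption concrete, here is a minimal sketch of a Naive Bayes classifier for 0/1 features. It is a Bernoulli-style variant written for illustration, with Laplace smoothing added as an assumption; the project code itself uses scikit-learn's GaussianNB, and the toy rows and labels are invented.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and, per class, how often each feature is 1."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(lambda: [0] * len(rows[0]))
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i] += value
    return class_counts, feature_counts


def predict(row, class_counts, feature_counts):
    """Pick the class maximising P(class) * product of P(feature_i | class)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total
        for i, value in enumerate(row):
            # Laplace smoothing keeps unseen combinations from zeroing the product.
            p_one = (feature_counts[label][i] + 1) / (n + 2)
            score *= p_one if value else 1 - p_one
        if score > best_score:
            best_label, best_score = label, score
    return best_label


rows = [[1, 0], [1, 0], [0, 1], [0, 1]]
labels = ["TOMATO", "TOMATO", "ONION", "ONION"]
model = train(rows, labels)
print(predict([1, 0], *model))  # TOMATO
```

Because every feature contributes its own factor to the product, the classifier never has to estimate how features interact, which is what keeps it cheap to train.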

Among the three algorithms used, the decision tree algorithm yielded the highest accuracy, 95%.
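The project code scores each model with scikit-learn's accuracy_score, once as a fraction and once as a count (normalize=False). The plain-Python sketch below mirrors that calculation on made-up labels, so the numbers are illustrative and are not the project's results.

```python
def accuracy(y_true, y_pred, normalize=True):
    """Fraction of correct predictions, or the raw count if normalize is False."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if normalize else correct


# Hypothetical encoded plant labels for five test rows.
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]

print(accuracy(y_true, y_pred))                   # 0.8
print(accuracy(y_true, y_pred, normalize=False))  # 4
```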

How is it different from competition

Other websites suggest either only the nutrients for a plant (e.g. hydroponic_nutrient solution) or only soil types (e.g. toppr_guides.com). This project suggests both the nutrients and the soil type. In the future, fertilizers may also be added.

Who are your customers

  1. Farmers
  2. Farming hobbyists
  3. Agriculture Students

Project Phases and Schedule

Phase 1: Data Collection Process

Phase 2: Algorithm Development

Phase 3: User interface development 

Resources Required

  • Data obtained from the Kaggle website [crop nutrient database by Chris Crawford]
  • Research papers [Van Keulen: crop yield and nutrients]
  • Google data
  • Anaconda Tool
  • Python 3.7

Project Code
/* Your file Name : Prediction.ipynb */
/* Your coding Language : python */
/* Your code snippet start here */
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tkinter import *\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "# from gui_stuff import *\n",
    "\n",
    "l1=['NITROGEN(N)','PHOSPHOROUS(P)','POTASSIUM(K)','CALCIUM(Ca)','SULFUR(S)','MAGNESIUM(Mg)','CARBON(C)','HYDROGEN(H)',\n",
    "    'IRON(Fe)','BORON(B)','CHLORINE(Cl)','MANGANESE(Mn)','ZINC(Zn)','COPPER(Cu)','MOLYBDENUM(Mo)','NICKEL(Ni)','SILICON(Si)',\n",
    "    'SODIUM(Na)','COBALT(Cb)','ALLUVIAL SOIL','BLACK SOIL','RED AND YELLOW SOIL','LATERITE SOIL','ARID SOIL','FOREST AND MOUNTAIN SOIL',\n",
    "    'DESERT SOIL','MARSHY SOIL  AND PEATY SOIL','WATER REQUIREMENT']\n",
    "PLANT=['spinach','TOMATO','DRUM STICK','POTATO','LADIES FINGER','ONION','BRINJAL','BITTER GOURD']\n",
    "\n",
    "l2=[]\n",
    "for x in range(0,len(l1)):\n",
    "    l2.append(0)\n",
    "\n",
    "# TRAINING DATA df ------------------------------------------------------------------------------------\n",
    "df=pd.read_csv(\"Training.csv\")\n",
    "\n",
    "# Encode each plant name as its index in PLANT so a predicted label maps back to the right name\n",
    "plant_codes = {name: code for code, name in enumerate(PLANT)}\n",
    "df.replace({'PLANT': plant_codes}, inplace=True)\n",
    "\n",
    "# print(df.head())\n",
    "\n",
    "X= df[l1]\n",
    "\n",
    "y = np.ravel(df[[\"PLANT\"]])   # flatten to the 1-D label array scikit-learn expects\n",
    "# print(y)\n",
    "\n",
    "# TESTING DATA tr ---------------------------------------------------------------------------------\n",
    "tr=pd.read_csv(\"Testing.csv\")\n",
    "tr.replace({'PLANT': plant_codes}, inplace=True)\n",
    "X_test= tr[l1]\n",
    "y_test = np.ravel(tr[[\"PLANT\"]])\n",
    "# ------------------------------------------------------------------------------------------------------\n",
    "\n",
    "def DecisionTree():\n",
    "\n",
    "    from sklearn import tree\n",
    "\n",
    "    clf3 = tree.DecisionTreeClassifier()   # empty model of the decision tree\n",
    "    clf3 = clf3.fit(X,y)\n",
    "\n",
    "    # calculating accuracy-------------------------------------------------------------------\n",
    "    from sklearn.metrics import accuracy_score\n",
    "    y_pred=clf3.predict(X_test)\n",
    "    print(accuracy_score(y_test, y_pred))\n",
    "    print(accuracy_score(y_test, y_pred,normalize=False))\n",
    "    # -----------------------------------------------------\n",
    "\n",
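    "    # StringVar values are read with .get(); calling the variable itself raises TypeError\n",
    "    pnutrition = [nutrition1.get(),nutrition2.get(),nutrition3.get(),soiltype.get()]\n",
    "\n",
    "    # rebuild the 0/1 feature vector from scratch so earlier selections do not persist\n",
    "    for k in range(0,len(l1)):\n",
    "        l2[k] = 1 if l1[k] in pnutrition else 0\n",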
    "\n",
    "    inputtest = [l2]\n",
    "    predict = clf3.predict(inputtest)\n",
    "    predicted=predict[0]\n",
    "\n",
    "    h='no'\n",
    "    for a in range(0,len(PLANT)):\n",
    "        if(predicted == a):\n",
    "            h='yes'\n",
    "            break\n",
    "\n",
    "\n",
    "    if (h=='yes'):\n",
    "        t1.delete(\"1.0\", END)\n",
    "        t1.insert(END, PLANT[a])\n",
    "    else:\n",
    "        t1.delete(\"1.0\", END)\n",
    "        t1.insert(END, \"Not Found\")\n",
    "\n",
    "\n",
    "def randomforest():\n",
    "    from sklearn.ensemble import RandomForestClassifier\n",
    "    clf4 = RandomForestClassifier()\n",
    "    clf4 = clf4.fit(X,np.ravel(y))\n",
    "\n",
    "    # calculating accuracy-------------------------------------------------------------------\n",
    "    from sklearn.metrics import accuracy_score\n",
    "    y_pred=clf4.predict(X_test)\n",
    "    print(accuracy_score(y_test, y_pred))\n",
    "    print(accuracy_score(y_test, y_pred,normalize=False))\n",
    "    # -----------------------------------------------------\n",
    "\n",
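    "    # StringVar values are read with .get(); calling the variable itself raises TypeError\n",
    "    pnutrition = [nutrition1.get(),nutrition2.get(),nutrition3.get(),soiltype.get()]\n",
    "\n",
    "    # rebuild the 0/1 feature vector from scratch so earlier selections do not persist\n",
    "    for k in range(0,len(l1)):\n",
    "        l2[k] = 1 if l1[k] in pnutrition else 0\n",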
    "\n",
    "    inputtest = [l2]\n",
    "    predict = clf4.predict(inputtest)\n",
    "    predicted=predict[0]\n",
    "\n",
    "    h='no'\n",
    "    for a in range(0,len(PLANT)):\n",
    "        if(predicted == a):\n",
    "            h='yes'\n",
    "            break\n",
    "\n",
    "    if (h=='yes'):\n",
    "        t2.delete(\"1.0\", END)\n",
    "        t2.insert(END, PLANT[a])\n",
    "    else:\n",
    "        t2.delete(\"1.0\", END)\n",
    "        t2.insert(END, \"Not Found\")\n",
    "\n",
    "\n",
    "def NaiveBayes():\n",
    "    from sklearn.naive_bayes import GaussianNB\n",
    "    gnb = GaussianNB()\n",
    "    gnb=gnb.fit(X,np.ravel(y))\n",
    "\n",
    "    # calculating accuracy-------------------------------------------------------------------\n",
    "    from sklearn.metrics import accuracy_score\n",
    "    y_pred=gnb.predict(X_test)\n",
    "    print(accuracy_score(y_test, y_pred))\n",
    "    print(accuracy_score(y_test, y_pred,normalize=False))\n",
    "    # -----------------------------------------------------\n",
    "\n",
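    "    # StringVar values are read with .get(); calling the variable itself raises TypeError\n",
    "    pnutrition = [nutrition1.get(),nutrition2.get(),nutrition3.get(),soiltype.get()]\n",
    "    # rebuild the 0/1 feature vector from scratch so earlier selections do not persist\n",
    "    for k in range(0,len(l1)):\n",
    "        l2[k] = 1 if l1[k] in pnutrition else 0\n",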
    "\n",
    "    inputtest = [l2]\n",
    "    predict = gnb.predict(inputtest)\n",
    "    predicted=predict[0]\n",
    "\n",
    "    h='no'\n",
    "    for a in range(0,len(PLANT)):\n",
    "        if(predicted == a):\n",
    "            h='yes'\n",
    "            break\n",
    "\n",
    "    if (h=='yes'):\n",
    "        t3.delete(\"1.0\", END)\n",
    "        t3.insert(END, PLANT[a])\n",
    "    else:\n",
    "        t3.delete(\"1.0\", END)\n",
    "        t3.insert(END, \"Not Found\")\n",
    "\n",
    "# gui_stuff------------------------------------------------------------------------------------\n",
    "\n",
    "root = Tk()\n",
    "root.configure(background='orange')\n",
    "\n",
    "# entry variables\n",
    "nutrition1 = StringVar()\n",
    "nutrition1.set(None)\n",
    "nutrition2 = StringVar()\n",
    "nutrition2.set(None)\n",
    "nutrition3 = StringVar()\n",
    "nutrition3.set(None)\n",
    "soiltype = StringVar()\n",
    "soiltype.set(None)\n",
    "Name = StringVar()\n",
    "\n",
    "# Heading\n",
    "w2 = Label(root, justify=LEFT, text=\"Soil Fertility Predictor using Machine Learning\", fg=\"white\", bg=\"orange\")\n",
    "w2.config(font=(\"Elephant\", 30))\n",
    "w2.grid(row=1, column=0, columnspan=2, padx=100)\n",
    "\n",
    "# labels\n",
    "\n",
    "S1Lb = Label(root, text=\"nutrition 1\", fg=\"yellow\", bg=\"black\")\n",
    "S1Lb.grid(row=7, column=0, pady=10, sticky=W)\n",
    "\n",
    "S2Lb = Label(root, text=\"nutrition 2\", fg=\"yellow\", bg=\"black\")\n",
    "S2Lb.grid(row=8, column=0, pady=10, sticky=W)\n",
    "\n",
    "S3Lb = Label(root, text=\"nutrition 3\", fg=\"yellow\", bg=\"black\")\n",
    "S3Lb.grid(row=9, column=0, pady=10, sticky=W)\n",
    "\n",
    "S4Lb = Label(root, text=\"soiltype\", fg=\"yellow\", bg=\"black\")\n",
    "S4Lb.grid(row=10, column=0, pady=10, sticky=W)\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "lrLb = Label(root, text=\"DecisionTree\", fg=\"white\", bg=\"red\")\n",
    "lrLb.grid(row=15, column=0, pady=10,sticky=W)\n",
    "\n",
    "destreeLb = Label(root, text=\"RandomForest\", fg=\"white\", bg=\"red\")\n",
    "destreeLb.grid(row=17, column=0, pady=10, sticky=W)\n",
    "\n",
    "ranfLb = Label(root, text=\"NaiveBayes\", fg=\"white\", bg=\"red\")\n",
    "ranfLb.grid(row=19, column=0, pady=10, sticky=W)\n",
    "\n",
    "# entries\n",
    "OPTIONS = sorted(l1)\n",
    "\n",
    "S1En = OptionMenu(root, nutrition1,*OPTIONS)\n",
    "S1En.grid(row=7, column=1)\n",
    "\n",
    "S2En = OptionMenu(root, nutrition2,*OPTIONS)\n",
    "S2En.grid(row=8, column=1)\n",
    "\n",
    "S3En = OptionMenu(root, nutrition3,*OPTIONS)\n",
    "S3En.grid(row=9, column=1)\n",
    "\n",
    "S4En = OptionMenu(root, soiltype,*OPTIONS)\n",
    "S4En.grid(row=10, column=1)\n",
    "\n",
    "\n",
    "dst = Button(root, text=\"DecisionTree\", command=DecisionTree,bg=\"green\",fg=\"red\")\n",
    "dst.grid(row=8, column=3,padx=10)\n",
    "\n",
    "rnf = Button(root, text=\"Randomforest\", command=randomforest,bg=\"green\",fg=\"red\")\n",
    "rnf.grid(row=9, column=3,padx=10)\n",
    "\n",
    "lr = Button(root, text=\"NaiveBayes\", command=NaiveBayes,bg=\"green\",fg=\"red\")\n",
    "lr.grid(row=10, column=3,padx=10)\n",
    "\n",
    "#textfileds\n",
    "t1 = Text(root, height=1, width=40,bg=\"white\",fg=\"black\")\n",
    "t1.grid(row=15, column=1, padx=10)\n",
    "\n",
    "t2 = Text(root, height=1, width=40,bg=\"white\",fg=\"black\")\n",
    "t2.grid(row=17, column=1 , padx=10)\n",
    "\n",
    "t3 = Text(root, height=1, width=40,bg=\"white\",fg=\"black\")\n",
    "t3.grid(row=19, column=1 , padx=10)\n",
    "\n",
    "root.mainloop()\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
