Machine learning-based optimal crop selection system in smart … – Nature.com

The authors divided the proposed model into two phases: phase 1 performs weather prediction, and phase 2 identifies the optimal crop. Phase 1 is implemented using Recurrent Neural Networks (RNN), and phase 2 uses Random Forest Classification to select crops from the predicted weather and the soil parameters of the region. Because the weather of an area on a given day is influenced by the weather of the preceding day or days, the authors regard the RNN as the most suitable ML algorithm for weather prediction in this work. In an RNN, the output of the previous step is fed back as input to the current step, which makes the weather prediction problem easy to map onto this algorithm. The authors used an LSTM RNN because of the dependence of weather on previous conditions. Appropriate crop selection considers multiple weather and soil parameters, which calls for combining multiple decision trees; a random forest classifier is well suited to this setting.

The flowchart for the proposed work is presented in Fig. 2.

Flow chart: Proposed methodology.

For this work, the authors considered the Telangana state of India, shown in Fig. 3. For weather prediction, data were collected in person from the data center of the National Remote Sensing Agency (NRSA), Hyderabad, for the years 2015–2020.

Telangana state of India: Area considered for this study.

The dataset comprised approximately 1993 records. The features of the data were temperature, wind speed and direction, humidity, sun hours, etc. The units of temperature, wind speed, and rainfall were Celsius, km/hour, and mm, respectively. In the first step, the authors pre-processed the data by filling in missing values, transforming, and normalizing. The missing values were handled using linear interpolation. Afterward, the data were processed to obtain the minimum and maximum temperature for each day from the respective features. Subsequently, the authors converted the dataset to a common scale using min-max normalization (ref. 30). Data preprocessing was done in Python using pandas and scikit-learn, including its MinMaxScaler class.
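The two preprocessing steps named above, linear interpolation of missing values and min-max normalization, can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: they report using pandas and scikit-learn, whereas the helper names `interpolate_linear` and `min_max_scale`, the edge-padding rule, and the sample temperatures below are assumptions made purely for illustration.

```python
from typing import List, Optional


def interpolate_linear(values: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest known points.

    Leading and trailing gaps are padded with the nearest known value
    (an assumption; the source does not say how series edges were handled).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(values)
    # Pad the edges with the nearest known value.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interpolate each interior gap between consecutive known points.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily maximum temperatures (Celsius) with missing readings.
max_temp = [30.0, None, 34.0, None, None, 40.0]
filled = interpolate_linear(max_temp)
scaled = min_max_scale(filled)
print(filled)
print(scaled)
```

With pandas and scikit-learn, the same two steps would typically be written with `Series.interpolate(method="linear")` and `MinMaxScaler`, which is presumably closer to what the authors ran.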

For the processed data, an RNN, an advancement of the neural network (NN), is used to handle sequence dependence. In an RNN, the input of the current step is taken from the output of the previous step, which helps the hidden state of the RNN remember the order. In a conventional NN, by contrast, every data element is treated as an independent entity. In the proposed work, a long short-term memory (LSTM) feedback-integrated RNN is used to deploy the model. A single node of the LSTM mesh is shown in Fig. 4.

A single node of LSTM mesh.

Every LSTM node comprises three gates: the Input (I), Output (O), and Forget (F) gates. The revised input at the present time-stamp T is denoted by G. The value of each gate depends on the preceding hidden state \(H_{T-1}\) and the present input \(X_{T}\), as given below (refs. 31, 32). In these equations, F(·) applied to an argument denotes the gate activation function, which is the logistic sigmoid in the standard LSTM formulation.

$$I_{T} = F\left( W_{I} X_{T} + U_{I} H_{T-1} + B_{I} \right)$$

(1)

$$F_{T} = F\left( W_{F} X_{T} + U_{F} H_{T-1} + B_{F} \right)$$

(2)

$$G_{T} = \tanh\left( W_{C} X_{T} + U_{C} H_{T-1} + B_{C} \right)$$

(3)

The revised value at the node is determined as:

$$C_{T} = I_{T} \cdot G_{T} + F_{T} \cdot C_{T-1}$$

(4)

The output gate value is derived from the node state, the preceding hidden state, and the present input to the node:

$$O_{T} = F\left( W_{O} X_{T} + U_{O} H_{T-1} + V_{O} \cdot C_{T} + B_{O} \right)$$

(5)

$$H_{T} = O_{T} \cdot \tanh\left( C_{T} \right)$$

(6)
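To make Eqs. (1) to (6) concrete, the following sketch runs one LSTM node over a short sequence using scalar weights. It is an illustration only, under several assumptions: the function name `lstm_step`, the parameter dictionary and all of its numeric values are hypothetical, the gate activation is taken to be the logistic sigmoid, and the input-gate bias is named `B_I` for consistency with the other input-gate parameters. A trained model would use weight matrices learned from data rather than these hand-picked scalars.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, assumed here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p, act=sigmoid):
    """One scalar LSTM step following Eqs. (1)-(6).

    p maps parameter names (W_*, U_*, V_O, B_*) to floats.
    Returns the new hidden state H_T and node state C_T.
    """
    i_t = act(p["W_I"] * x_t + p["U_I"] * h_prev + p["B_I"])        # Eq. (1)
    f_t = act(p["W_F"] * x_t + p["U_F"] * h_prev + p["B_F"])        # Eq. (2)
    g_t = math.tanh(p["W_C"] * x_t + p["U_C"] * h_prev + p["B_C"])  # Eq. (3)
    c_t = i_t * g_t + f_t * c_prev                                  # Eq. (4)
    o_t = act(p["W_O"] * x_t + p["U_O"] * h_prev
              + p["V_O"] * c_t + p["B_O"])                          # Eq. (5)
    h_t = o_t * math.tanh(c_t)                                      # Eq. (6)
    return h_t, c_t


# Hypothetical scalar parameters, chosen only so the example runs.
params = {
    "W_I": 0.5, "U_I": 0.1, "B_I": 0.0,
    "W_F": 0.4, "U_F": 0.2, "B_F": 1.0,
    "W_C": 0.9, "U_C": 0.3, "B_C": 0.0,
    "W_O": 0.6, "U_O": 0.1, "V_O": 0.2, "B_O": 0.0,
}

# Feed a short normalized sequence through the node, carrying state forward.
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:
    h, c = lstm_step(x, h, c, params)
h_final = h
print(h_final, c)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), the hidden state stays strictly inside (-1, 1), which is one reason the inputs are normalized to a common scale beforehand.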

In this work, the authors trained three RNN models: one for the minimum temperature (Min. Temp.), one for the maximum temperature (Max. Temp.), and one to predict the rainfall in the region. To predict the Max. Temp., the authors used the data for the years 2015–2018 for training; the data for 2019–2020 were used to test and verify the model's accuracy. The data are organized as a vector {X(1), …, X(K)} (ref. 3). For both types of forecasting, i.e., seasonal and 90 days prior, the dataset is mapped to an N × M matrix in which each row holds one input feature and 90 target values, expressed as:

$$\left\{ X\left( T \right), X\left( T+1 \right), X\left( T+2 \right), \ldots, X\left( T+90 \right) \right\}$$

(7)

In the above discussion, K represents the size of the time-series data. The dimension N is K − 90 and the dimension M is 91. As the weather prediction is made by the LSTM, the model consists of an input layer, a hidden layer, and an output layer. The hidden layer comprises 4 LSTM nodes. The Min. Temp. and rainfall are predicted similarly. The pseudocode for weather prediction is shown in Fig. 5. Various features from the data are fed to the input layer. The input layer processes the data and forwards it to the middle layer. The middle layer comprises many hidden layers. Each hidden layer has its own activation function, bias, and weight. Due to the dependency of weather conditions on past data, an LSTM-RNN is used in this work.
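The mapping of a length-K series to an N × M matrix can be sketched as a sliding window. The sketch below assumes that each row starts at one time step T and runs through T + 90, so that a series of K values yields K − 90 rows of 91 values each; the function name `make_windows` and the placeholder series are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def make_windows(series: Sequence[float], horizon: int = 90) -> List[List[float]]:
    """Return rows [X(T), X(T+1), ..., X(T+horizon)] for every valid T."""
    width = horizon + 1  # one input value plus `horizon` targets
    return [list(series[t:t + width]) for t in range(len(series) - width + 1)]


# Placeholder series standing in for K = 100 normalized daily readings.
series = [float(day) for day in range(100)]
rows = make_windows(series)

# Split each row into the single input feature and its 90 target values.
inputs = [row[0] for row in rows]
targets = [row[1:] for row in rows]

print(len(rows), len(rows[0]))  # N = K - 90 rows, M = 91 columns
```

A series shorter than 91 values produces no rows under this scheme, so the window length bounds the minimum amount of history needed.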

Pseudo code for weather prediction.

The results obtained using the RNN were compared with those of an ANN and found to be more accurate.

As mentioned in the section "Weather prediction", the Telangana state is considered in this work for weather, soil, and crop data analysis. The geographical area of the state is classified into three agro-climatic belts, i.e., north Telangana, southern Telangana, and central Telangana. Different areas have different soil features. In north Telangana, the soil is red, shallow black, and profoundly calcareous. Southern Telangana comprises various textures of red soil, alluvial, and calcareous soil. Central Telangana is covered with red and calcareous soil (ref. 33). The land is rated as low, medium, or highly fertile depending on its nutrient index. The main crops grown in the state are maize, rice, chilli, cotton, and soybean. These five crops are harvested over almost 5054 thousand hectares of the total agricultural land. The input to the crop selection model comprises soil and weather parameters. The soil parameters supplied to the algorithm are soil type, pH value, water-preserving capability, and fertility. These soil parameters, together with the weather parameters predicted in the first phase, are collectively used to decide the appropriate crop for the land. Before being passed to the model, categorical parameters such as soil type, water capacity, and fertility were encoded as numerical values. The model can be used for both seasonal and annual crop selection. For a season, the proposed model can recommend more than one suitable crop along with its requirements, such as water, and suggest an appropriate time to sow the crops. The example dataset used in the proposed model is presented in Table 2.
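The encoding of categorical soil parameters as numerical values, mentioned above, could look like the following sketch. The source does not state which encoding scheme was used, so this is an assumption: fertility is mapped to an ordinal scale because its levels (low, medium, high) are ordered, and soil type is mapped to an arbitrary integer index. The field names, the helper names `build_index` and `encode_record`, and the sample record are all hypothetical.

```python
from typing import Dict, Iterable, List

# Ordered levels taken from the text; the integer codes are an assumption.
FERTILITY_LEVELS: Dict[str, int] = {"low": 0, "medium": 1, "high": 2}


def build_index(categories: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct category a stable integer code (alphabetical order)."""
    return {name: idx for idx, name in enumerate(sorted(set(categories)))}


def encode_record(record: Dict[str, object], soil_index: Dict[str, int]) -> List[float]:
    """Turn one soil record into a numeric feature vector."""
    return [
        float(soil_index[str(record["soil_type"])]),
        float(FERTILITY_LEVELS[str(record["fertility"])]),
        float(record["ph"]),
    ]


# Hypothetical records, not data from the paper.
records = [
    {"soil_type": "red", "fertility": "medium", "ph": 6.5},
    {"soil_type": "alluvial", "fertility": "high", "ph": 7.1},
    {"soil_type": "calcareous", "fertility": "low", "ph": 8.0},
]
soil_index = build_index(r["soil_type"] for r in records)
features = [encode_record(r, soil_index) for r in records]
print(features)
```

Tree-based classifiers such as random forests split on thresholds, so integer codes of this kind are usually workable for them, although one-hot encoding is a common alternative for unordered categories like soil type.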

A total of 10 crops grown in the state of Telangana are considered in the proposed model: soybean, castor, green gram, sunflower, red gram, maize, chilli, cotton, jowar, and rice. However, the model can be adapted to any number of crops and any type of land. This work applies the random forest classification technique for reasonable crop prediction considering soil and water parameters. The random forest classifier uses a set of decision trees, each developed from a subset of the training data, and aggregates the outputs of the individual trees to decide the outcome: the class with the maximum number of votes is taken as the result. The model is customized to check the suitability of more than one crop for a particular piece of land, which is implemented using a threshold value. A crop is included in the list of appropriate crops if the number of decision trees in the random forest that output that crop exceeds the threshold value Th:

$$Th = 2 \times \left( \frac{\text{Number of trees generated with Random Forest Classifier}}{\text{Number of Classes}} \right)$$

(8)
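The threshold rule of Eq. (8) can be illustrated with a short sketch that works on the per-tree predictions of an already trained forest. It assumes that the "value" compared against Th is the number of trees voting for a crop; the function name `suitable_crops` and the vote counts are hypothetical. With 100 trees and 10 classes, Th = 2 × (100 / 10) = 20, so a crop is listed only if more than 20 trees vote for it, that is, more than twice the share it would receive if votes were spread evenly.

```python
from collections import Counter
from typing import List, Sequence


def suitable_crops(tree_votes: Sequence[str], n_classes: int) -> List[str]:
    """Return every crop whose vote count exceeds Th from Eq. (8).

    tree_votes holds one predicted crop label per decision tree.
    """
    threshold = 2 * (len(tree_votes) / n_classes)  # Eq. (8)
    counts = Counter(tree_votes)
    # most_common() orders crops from most to fewest votes.
    return [crop for crop, votes in counts.most_common() if votes > threshold]


# Hypothetical votes from a 100-tree forest over 10 crop classes.
votes = ["rice"] * 45 + ["maize"] * 30 + ["cotton"] * 15 + ["chilli"] * 10
recommended = suitable_crops(votes, n_classes=10)
print(recommended)
```

Under this rule the list can be empty when no crop attracts enough votes, and it can contain several crops when the forest is split between a few strong candidates, which matches the stated goal of recommending more than one crop for a piece of land.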

The pseudocode for crop selection is shown in Fig. 6.

Pseudo code for crop selection.

The output received with sample data using the classification algorithm is presented in Table 3.
