Crop Yield Forecasting using Machine Learning Models

7 min read · Jul 18, 2022

PROBLEM STATEMENT / INTRODUCTION

According to estimates, 811 million people still lack access to basic food [1], and by 2050 that number is projected to have grown by more than 2 billion people [2]. The United Nations’ 2030 Agenda for Sustainable Development prioritizes ending hunger and improving food security [3].

Yield estimation, the ability to forecast agricultural output reliably before crops are harvested, is a major challenge in addressing food security. Crop yield is the amount of grain or seed produced by a given tract of land. It is usually expressed in bushels per acre or kilograms per hectare. For example, the average output per acre is used to calculate a farmer’s yield on a given field over a period of time. Because it reflects all the effort and resources farmers invest in growing plants on their farms, it is often viewed as the most important indicator of a farmer’s success. Agricultural monitoring can help improve food production and support humanitarian efforts in the face of climate change and droughts, especially in developing nations [2].

Integrating technology to achieve these results is a challenge for the Indian agriculture sector. New technologies and the use of nonrenewable energy resources have disturbed rainfall and temperature patterns. The unpredictability caused by global warming makes these patterns harder for farmers to anticipate, which affects agricultural productivity. Machine learning models such as CNNs and RNNs can be used to learn patterns from data, make reliable predictions, and cope with inconsistent temperature and rainfall trends. This would benefit India’s agricultural growth and, as a result, farmers’ living standards.

OBJECTIVE

We will look at a scalable, accurate, and low-cost approach to predicting agricultural yields from publicly available remote sensing data [4]. The approach improves on previous techniques in three ways. First, it abandons the hand-crafted features traditionally used in the remote sensing community in favor of modern representation learning techniques. Second, the study presents a new dimensionality reduction technique that makes it possible to train a CNN or an LSTM network and automatically learn useful features even when labeled training data are scarce. Finally, it includes a Gaussian Process (GP) component to explicitly model the spatio-temporal structure of the data and improve accuracy. The method is tested on county-level soybean yield prediction in the United States and is found to outperform existing techniques. The results are compared using root mean squared error (RMSE).
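Since RMSE is the comparison metric used throughout, a minimal sketch of how it can be computed may help. The function name `rmse` and the yield values below are illustrative assumptions, not figures from the study.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical county yields in bushels per acre (made-up numbers).
predicted = [42.0, 38.0, 51.0]
actual = [40.0, 41.0, 50.0]
print(round(rmse(predicted, actual), 3))  # prints 2.16
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, and the result is in the same unit as the yield itself.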

METHODOLOGY:

1. CNN (Convolutional Neural Network)

A CNN is a type of artificial neural network (ANN) used in deep learning (DL), most commonly to analyze visual imagery. CNNs are based on a shared-weight architecture of convolution kernels (filters) that slide along the input features and produce feature maps. Applications include image and video recognition, edge detection, classification, segmentation, brain-computer interfaces, and financial time series. Compared with a traditional fully connected neural network, a CNN has several hidden layers that perform convolutions, typically followed by a ReLU activation function.
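The sliding-kernel idea can be sketched in one dimension with NumPy. This is only an illustration of weight sharing and ReLU, not the network used in the study; the helper names `conv1d_valid` and `relu` and the toy signal are assumptions made for the example.

```python
import numpy as np

def conv1d_valid(signal, kernel, stride=1):
    """Slide one shared kernel along the signal (no padding).

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before sliding.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_out = (len(signal) - len(kernel)) // stride + 1
    return np.array([
        np.dot(signal[i * stride : i * stride + len(kernel)], kernel)
        for i in range(n_out)
    ])

def relu(x):
    """Rectified linear unit: negative responses are set to zero."""
    return np.maximum(x, 0.0)

signal = [1.0, 2.0, 4.0, 3.0, 1.0]
edge_kernel = [-1.0, 1.0]  # responds to increases between neighbours
feature_map = relu(conv1d_valid(signal, edge_kernel))
print(feature_map)  # prints [1. 2. 0. 0.]
```

The same two weights are reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected layer over the same input.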

2. LSTM (Long Short-Term Memory)

LSTM is a recurrent neural network (RNN) architecture used in deep learning. Unlike standard feedforward neural networks, an LSTM has feedback connections. It can process entire sequences of data, such as speech or video, rather than only single data points. For example, it can be used for speech recognition, recognition of unsegmented, connected handwriting, and anomaly detection in network traffic.
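To make the feedback connection concrete, here is a minimal NumPy sketch of a standard LSTM cell applied step by step to a short sequence. The gate ordering inside the stacked weight matrices, the sizes, and the random, untrained weights are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. Shapes: W (4H, D), U (4H, H), b (4H,)."""
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b               # h_prev is the feedback path
    i = sigmoid(z[0:hidden])                 # input gate
    f = sigmoid(z[hidden:2 * hidden])        # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])    # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])    # candidate cell state
    c = f * c_prev + i * g                   # keep part of the old memory, add new
    h = o * np.tanh(c)                       # expose part of the memory
    return h, c

def run_lstm(sequence, W, U, b):
    """Feed a (time, features) sequence through the cell; return the last h."""
    hidden = U.shape[1]
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x in sequence:
        h, c = lstm_step(x, h, c, W, U, b)
    return h

rng = np.random.default_rng(0)
n_features, n_hidden, n_steps = 3, 4, 5      # illustrative sizes
W = 0.1 * rng.standard_normal((4 * n_hidden, n_features))
U = 0.1 * rng.standard_normal((4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)
sequence = rng.random((n_steps, n_features))
final_h = run_lstm(sequence, W, U, b)
print(final_h.shape)  # prints (4,)
```

The hidden state `h` and cell state `c` computed at one step are inputs to the next step, so the final `h` summarizes the whole sequence.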

3. GP (Gaussian Process)

Gaussian processes inherit useful properties from the normal (Gaussian) distribution, which makes them valuable in statistical modeling. For example, if a random process is modeled as a GP, the distributions of various derived quantities can be obtained explicitly. While exact models frequently scale poorly as the amount of data grows, a variety of approximation methods have been developed that often maintain acceptable accuracy while lowering computation time dramatically.
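As a rough illustration of exact GP regression, the sketch below computes the posterior mean and covariance in closed form with NumPy. The RBF kernel, its hyperparameters, the noise level, and the toy data are assumptions for the example; the study's GP component may be configured differently.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    sq_dist = (a - b.T) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Exact GP posterior mean and covariance at x_test."""
    y_train = np.asarray(y_train, dtype=float)
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(y_train))
    k_st = rbf_kernel(x_test, x_train)
    k_ss = rbf_kernel(x_test, x_test)
    # The Cholesky factorization costs O(n^3) in the number of training
    # points, which is why exact GPs scale poorly with data size.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    cov = k_ss - v.T @ v
    return mean, cov

x_train = np.array([0.0, 1.0, 2.0, 3.0])   # toy inputs
y_train = np.sin(x_train)                  # toy targets
mean, cov = gp_posterior(x_train, y_train, [1.5])
print(mean.shape, cov.shape)  # prints (1,) (1, 1)
```

Both the prediction and its uncertainty come out of the same closed-form expressions, which is the practical benefit of the Gaussian assumption.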

4. Problem Setting

We want to predict the average yield per unit area in a given geographical region, such as a county or district. The input is a sequence of multispectral images covering the region of interest. The objective is to learn a model that maps these raw image sequences to average crop yields. Intuitively, the images contain features related to plant growth.

5. From Raw Images to Histograms

End-to-end training of a deep model is not feasible because labeled training data are scarce. Pre-training on standard computer vision benchmarks such as ImageNet is also unsuitable. The study therefore developed a dimensionality reduction technique based on a permutation invariance assumption: we do not expect the yield to depend heavily on the position of the image pixels, since position only indicates where the farmland is located. For each band I_k of an image (omitting the index for notational ease), the pixel values are discretized into b bins, producing a histogram h_k for that band. Concatenating all h_k gives a compact representation of the raw multispectral image. In practice, b = 256 and d = 9, where d is the number of bands. Handling each band independently amounts to an implicit mean-field assumption [5]. In other words, under the permutation invariance assumption only the number of pixels of each type (the pixel counts) is informative, so converting a high-dimensional image to histograms discards the spatial layout but loses little information relevant to yield.
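The band-wise histogram step can be sketched with NumPy as follows. The image size, the bin count of 32, the value range, and the normalization of counts to fractions are assumptions made for illustration, and `image_to_histograms` is a hypothetical helper, not a function from the referenced repository.

```python
import numpy as np

def image_to_histograms(image, n_bins, value_range):
    """Turn a (height, width, bands) image into (bands, n_bins) histograms.

    Each band is binned independently. Counts are normalized to fractions
    so that regions of different size stay comparable (an assumption).
    """
    image = np.asarray(image, dtype=float)
    n_bands = image.shape[-1]
    hists = np.zeros((n_bands, n_bins))
    for k in range(n_bands):
        counts, _ = np.histogram(image[..., k].ravel(),
                                 bins=n_bins, range=value_range)
        total = counts.sum()
        if total > 0:
            hists[k] = counts / total
    return hists

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16, 9))  # synthetic 9-band image
hists = image_to_histograms(image, n_bins=32, value_range=(0, 256))
descriptor = hists.ravel()                      # concatenation of all h_k

# Permutation invariance: shuffling pixel positions leaves h_k unchanged.
pixels = image.reshape(-1, 9)
shuffled = pixels[rng.permutation(len(pixels))].reshape(image.shape)
same = np.allclose(hists, image_to_histograms(shuffled, 32, (0, 256)))
print(descriptor.shape, same)  # prints (288,) True
```

The descriptor has d × b entries regardless of how many pixels the region contains, which is what makes the later models trainable on few labeled examples.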

6. From Histogram to Crop Yield

While the histogram technique of the preceding section significantly reduces the dimensionality of the input, the desired mapping from histograms to yield is still highly nonlinear and complex. Rather than hand-crafting features, the study uses representation learning and deep models to learn the relevant features from data. The sequential nature of the inputs suggests temporal models such as LSTMs. The LSTM architecture takes a sequence of vectors as input, and a fully connected layer on the last LSTM cell produces the prediction for the input sequence. Each histogram is flattened into a vector before being fed into the network. For the regression task, L2 loss is used. To prevent overfitting, the network is regularized by adding a dropout layer with a dropout rate of 0.75 after each state transition.

Encouraged by the success of CNN architectures on sequential data [6], the work also models the nonlinear mapping with a CNN. The inputs are stacked into a 3-D histogram and fed into the CNN. The convolution is then performed over the “bin” and “time” dimensions. Because different positions in the histogram have distinct physical meanings, the location invariance provided by a pooling layer [7] is not desirable in this setting. The pooling layer is therefore replaced by a convolutional layer with a stride of two, which reduces the size of the intermediate feature maps. Batch normalization [8] is used to facilitate gradient flow, and dropout with a rate of 0.5 is applied after each convolutional layer to reduce overfitting.
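The two input layouts described above can be sketched in NumPy: a stride-two convolution over the “bin” and “time” axes that halves the feature-map size without pooling, and the per-time-step flattening used for the LSTM. The tensor sizes, the 2 × 2 kernel, the number of filters, and the random weights are illustrative assumptions; batch normalization and dropout are left out to keep the sketch short.

```python
import numpy as np

def conv2d_strided(x, filters, stride=2):
    """Valid cross-correlation of x (H, W, C) with filters (kh, kw, C, F)."""
    h, w, _ = x.shape
    kh, kw, _, n_filters = filters.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.empty((out_h, out_w, n_filters))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride : i * stride + kh,
                      j * stride : j * stride + kw, :]
            # Sum over kernel height, kernel width and input channels.
            out[i, j] = np.tensordot(patch, filters,
                                     axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
n_bins, n_steps, n_bands = 32, 30, 9             # hypothetical sizes
hist = rng.random((n_bins, n_steps, n_bands))    # stacked 3-D histogram

# CNN path: bands act as channels; stride 2 downsamples bins and time.
filters = rng.standard_normal((2, 2, n_bands, 4))
feature_maps = conv2d_strided(hist, filters, stride=2)
print(feature_maps.shape)  # prints (16, 15, 4)

# LSTM path: one flattened histogram vector per time step.
sequence = hist.transpose(1, 0, 2).reshape(n_steps, n_bins * n_bands)
print(sequence.shape)  # prints (30, 288)
```

Unlike pooling, the strided convolution has learned weights for every position inside its window, so it can shrink the feature maps while still distinguishing between neighbouring bins.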

[Figure: histogram visualization, the LSTM structure, and the CNN structure]

EXPECTED RESULTS AND OUTCOMES

Results are reported as the Root Mean Square Error (RMSE) of county-level predictions. Each result is averaged over two runs to account for random initialization and dropout during deep model training. Each row of the results corresponds to the predictions for one year, made by a model trained on data from previous years. Learning rates and stopping criteria are tuned on a held-out validation set (10 percent). The results show that the CNN and LSTM approaches clearly outperform competing methods. With the GP component added, the models perform better still, with a 30% reduction in RMSE relative to the best competing approaches. The prediction errors of the CNN model for the year 2014 are presented to demonstrate the GP’s ability to reduce spatially correlated errors: the correlation diminishes once the GP component is included. The study attributes these errors to properties that are not visible in remote sensing images (e.g., soil). The GP component learns these patterns from past training data and effectively corrects for them.

The end product will help landowners and farmers by predicting the yield shortly before harvest. This will give them an estimate of the expected profit or loss so that they can plan the harvest accordingly. The food ministry will also be able to estimate the overall yield of different crops, so that it can manage crop distribution and invest appropriately.

CONCLUSION:

This article examines a study that presents a deep learning framework for predicting crop yield from remote sensing data. It allows real-time prediction throughout the year and is applicable worldwide, particularly in developing countries where field surveys are difficult to conduct. It presents a Deep Gaussian Process framework that successfully removes spatially correlated errors and proposes a dimensionality reduction technique based on histograms, which may inspire future applications in remote sensing and computational sustainability.

REFERENCES:

[1] https://www.actionagainsthunger.org/world-hunger-facts-statistics
[2] Dodds, F., & Bartram, J. (Eds.). (2016). The Water, Food, Energy and Climate Nexus: Challenges and an agenda for action (1st ed.). Routledge. https://doi.org/10.4324/9781315640716
[3] https://sdgs.un.org/goals
[4] https://github.com/gabrieltseng/pycrop-yield-prediction
[5] Parisi, G., & Machta, J. (1989). Statistical Field Theory. American Journal of Physics 57, 286. https://doi.org/10.1119/1.16061
[6] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014). Large-Scale Video Classification with Convolutional Neural Networks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1725–1732. 10.1109/CVPR.2014.223. https://ieeexplore.ieee.org/document/6909619
[7] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature 521, 436–444. https://doi.org/10.1038/nature14539
[8] Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://doi.org/10.48550/arXiv.1502.03167


Written by Dikshant Dwivedi