Information on crop yield at scales ranging from the field to the global level is essential for farmers and decision makers. First, farmers need reliable crop yield estimates, since knowledge of crop yield at the field level helps them monitor the yield effects of management choices, identify potential threats (e.g. consequences of increasing drought occurrence during the growing season) and explore potential opportunities [1]. Knowledge of average yield and yield variability is also essential for insurance purposes [1]. Second, crop yield data are needed for decision making and strategic planning [2], [3]. For example, regional crop yield data are indispensable for policy and decision makers when setting up agricultural development programs. Current data sources, such as regional agricultural statistics, often lack the spatial and temporal resolution needed to monitor crop yield at the field level. Crop yield estimates derived from remotely sensed vegetation indices (VIs) could offer a higher spatial and temporal resolution: the spatial resolution of VIs can be as high as a few meters, and some VIs are available at a daily time step [4]. VIs monitor particular properties of crops that can be related to final crop yield at the farm [5] and regional scale [4]. A VI that is often used to monitor crop yield is the normalized difference vegetation index (NDVI), an indicator of the photosynthetically active biomass [6]. Several studies have shown that the photosynthetically active biomass (i.e. NDVI) during the growing season, or at particular stages of it, is related to crop yield [4], [6]–[8]. Remotely sensed VIs such as NDVI can therefore be used to estimate crop yield with empirical modelling strategies.
Empirical crop yield models are less data intensive than mechanistic VI-based crop models, since they relate VIs to crop yield using statistical techniques such as linear regression or random forests.
In this research we evaluated the applicability of empirical NDVI-based crop yield models for sugar beet, potato and winter wheat in northern Belgium. Crop yield data from 468 sugar beet, 685 potato and 666 winter wheat fields for 2016-2018 were available at the farm level from the Department of Agriculture. The openEO platform (https://openeo.org/) was used to extract 5-daily Sentinel-2 NDVI pixels (10 m resolution) within each field, apply a cloud mask based on the Sentinel-2 scene classification layer, and compute the average NDVI series for each field from the extracted pixels [9]. For each field the NDVI integral was calculated using the trapezoidal rule [4]. NDVI values between day of year (DOY) 91-273 (i.e. beginning of April to end of September), 121-273 (i.e. end of April to end of September) and 1-202 (i.e. beginning of January to end of July) were taken as the growing season for sugar beet, potato and winter wheat, respectively. Random forest models were built for this evaluation. In a first model, the NDVI integral was the only predictor of crop yield. In a second model, weather variables (i.e. monthly precipitation (P) and maximum temperature (Tmax) during the growing season) were added. For sugar beet and potato a third model was built, in which yield was modelled as a function of the NDVI integral and the monthly root zone soil water depletion during the growing season. Soil water depletion was modelled for each field from crop-specific parameters, soil texture and weather data (i.e. minimum and maximum temperature, precipitation and reference evapotranspiration) using AquaCrop-OS [10]. These models allowed us to evaluate whether adding soil texture information, by including the root zone soil water depletion, improved the crop yield model for sugar beet and potato.
Finally, a fourth model was built for sugar beet and potato, in which crop yield was modelled as a function of the NDVI integral and the monthly P, Tmax and root zone soil water depletion during the growing season. The random forest models were based on 500 trees. The out-of-bag prediction error (MSE) and the corresponding explained variance (R²) were used to evaluate model performance. The importance of the predictor variables was calculated for each model in order to determine which predictors explained most of the crop yield variability.
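As an illustration of the NDVI integral described above, the following minimal Python sketch integrates a field-average NDVI series with the trapezoidal rule over the crop-specific DOY windows given in the text. All function and variable names are hypothetical. Two choices are assumptions rather than the authors' documented procedure: cloud-masked observations (marked None) are dropped so that a trapezoid spans the gap, and the explained variance is taken as 1 - MSE/Var(y).

```python
# Growing-season windows (first DOY, last DOY) as stated in the text.
GROWING_SEASON_DOY = {
    "sugar beet": (91, 273),
    "potato": (121, 273),
    "winter wheat": (1, 202),
}


def ndvi_integral(doys, ndvi_values, start_doy, end_doy):
    """Trapezoidal integral of a field-average NDVI series within a DOY window.

    Observations equal to None (assumed to be cloud-masked) are skipped,
    so the neighbouring valid observations are joined by one trapezoid.
    """
    points = sorted(
        (d, v)
        for d, v in zip(doys, ndvi_values)
        if v is not None and start_doy <= d <= end_doy
    )
    total = 0.0
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        total += 0.5 * (v0 + v1) * (d1 - d0)
    return total


def explained_variance(mse, observed_yields):
    """Pseudo R² from a prediction MSE, assuming R² = 1 - MSE / Var(y)."""
    n = len(observed_yields)
    mean = sum(observed_yields) / n
    variance = sum((y - mean) ** 2 for y in observed_yields) / n
    return 1.0 - mse / variance


# Example with a 5-daily series in which one acquisition is cloud-masked.
doys = [91, 96, 101, 106]
ndvi = [0.2, 0.3, None, 0.5]
start, end = GROWING_SEASON_DOY["sugar beet"]
integral = ndvi_integral(doys, ndvi, start, end)
```

The resulting integral would then serve as one predictor column in the random forest models; the forest itself is not reproduced here.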
We found that empirical crop models based only on the NDVI integral were not able to explain winter wheat and potato yield variability in northern Belgium. For sugar beet, the random forest model based only on the NDVI integral explained part of the yield variability (R²=0.16). The NDVI series of winter wheat and potato were not sensitive enough to yield-affecting weather and soil water conditions during important phenological stages. For 2016-2018, winter wheat yield variability was better predicted by monthly precipitation during tillering and anthesis together with the NDVI integral (R²=0.66) than by the NDVI integral alone. For both sugar beet and potato, the models based on the NDVI integral and monthly root zone soil water depletion performed similarly to the models based on the NDVI integral and monthly weather variables. Using both weather variables and root zone soil water depletion in combination with the NDVI integral did not improve model performance compared with using only one of the two in combination with the NDVI integral. However, the variable importance plots show that Tmax and root zone soil water depletion in certain months, in combination with the NDVI integral, explain a large part of the sugar beet and potato yield variability. Maximum temperature in September, root zone soil water depletion in June and the NDVI integral were the most important variables for explaining potato yield variability (R²=0.56). For sugar beet, soil water depletion in April explained a large part of the yield variability (R²=0.84). In addition, the NDVI integral and maximum temperature in September were important variables for explaining sugar beet yield variability.
Our findings confirm the importance of meteorological variables during sensitive phenological stages [11]. We conclude that yield-affecting weather and soil water conditions during important phenological stages are needed in addition to the NDVI integral to model the crop yield variability of sugar beet, potato and winter wheat in northern Belgium with empirical crop models.
References
[1] D. B. Lobell, D. Thau, C. Seifert, E. Engle, and B. Little, “A scalable satellite-based crop yield mapper,” Remote Sens. Environ., vol. 164, pp. 324–333, Jul. 2015, doi: 10.1016/j.rse.2015.04.021.
[2] P. C. Doraiswamy, B. Akhmedov, L. Beard, A. Stern, and R. Mueller, “Operational prediction of crop yields using MODIS data and products. Workshop proceedings: Remote sensing support to crop yield forecast and area estimates, ISPRS Archives XXXVI-8/W48,” p. 5, 2007.
[3] I. Becker-Reshef, E. Vermote, M. Lindeman, and C. Justice, “A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data,” Remote Sens. Environ., vol. 114, no. 6, pp. 1312–1323, Jun. 2010, doi: 10.1016/j.rse.2010.01.010.
[4] A. Vannoppen et al., “Wheat Yield Estimation from NDVI and Regional Climate Models in Latvia,” Remote Sens., vol. 12, p. 2206, Jul. 2020, doi: 10.3390/rs12142206.
[5] A. Vannoppen and A. Gobin, “Estimating Farm Wheat Yields from NDVI and Meteorological Data,” Agronomy, vol. 11, no. 5, Art. no. 5, May 2021, doi: 10.3390/agronomy11050946.
[6] Y. Ö. Durgun, A. Gobin, G. Duveiller, and B. Tychon, “A study on trade-offs between spatial resolution and temporal sampling density for wheat yield estimation using both thermal and calendar time,” Int. J. Appl. Earth Obs. Geoinformation, vol. 86, p. 101988, Apr. 2020, doi: 10.1016/j.jag.2019.101988.
[7] G. Genovese, C. Vignolles, T. Nègre, and G. Passera, “A methodology for a combined use of normalised difference vegetation index and CORINE land cover data for crop yield monitoring and forecasting. A case study on Spain,” Agronomie, vol. 21, no. 1, pp. 91–111, Jan. 2001, doi: 10.1051/agro:2001111.
[8] O. Rojas, “Operational maize yield model development and validation based on remote sensing and agro‐meteorological data in Kenya,” Int. J. Remote Sens., vol. 28, no. 17, pp. 3775–3793, Sep. 2007, doi: 10.1080/01431160601075608.
[9] M. Schramm et al., “The openEO API–Harmonising the Use of Earth Observation Cloud Services Using Virtual Data Cube Functionalities,” Remote Sens., vol. 13, no. 6, Art. no. 6, Jan. 2021, doi: 10.3390/rs13061125.
[10] T. Foster et al., “AquaCrop-OS: An open source version of FAO’s crop water productivity model,” Agric. Water Manag., vol. 181, pp. 18–22, Feb. 2017, doi:
Automatic crop yield estimation and forecasting systems are of major interest for the agricultural sector. The recent availability of high spatial and temporal resolution remote sensing data from the Copernicus Sentinel-2 system allows fine and precise crop monitoring at parcel level. This study presents an automatic algorithm, named ECYFS (Earth Observation based Crop Yield Forecasting System), able to ingest Harmonized Landsat Sentinel-2 (HLS) data and weather data to estimate, and to forecast before the end of the season, winter wheat yield at the level of subnational units (SNU: NUTS-3 in France, NUTS-2 in Belgium, county in the US). The study is based on the analysis of 15 national units (NU: France, Belgium and 13 US states) over five cropping seasons from 2014 to 2018.
The developed system relies on a large number of Yield Features (YF) calculated for each crop parcel based on (i) Leaf Area Index (LAI) derived from the HLS data set and (ii) gridded meteorological data sets. The Canopy Structure Dynamic Model (CSDM), fitted for each wheat parcel to the LAI time series, is used to extract basic crop development stages (see figure) and to derive YF across dynamic aggregation periods. In addition, the Simple Algorithm For Yield Estimate (SAFY) model is calibrated using the LAI time series. All YF are finally averaged at SNU level and compared with official statistics. The final statistical model relies on multi-linear stepwise regression with forward and backward YF selection, combined with a double-loop cross-validation strategy.
The algorithm shows the capacity to adapt to various agroclimatic conditions, with distinct sets of selected YF across the regions. The yield model provides after-season uncertainties ranging between 12% and 43% at SNU level and up to 8% at NU level. The overall SNU uncertainty is 13% for 8 selected major US wheat states. ECYFS failed to estimate yield in some identified regions due to a lack of relevant and well-correlated YF. In forecasting mode, the system reached 22% uncertainty as of mid-June at SNU level and below 15% for some NU, such as Kansas.
Over the last decade, food security has become one of the world’s greatest challenges. By 2050, the world’s population will be 34 percent higher than today; this massive increase will mainly affect developing countries and will raise food demand. Reliable, robust and timely information on food production, agricultural practices and natural resources is required. Its precision and coverage, and the related data analytics, have to inform decision-making processes in a rigorous and sustainable way. Finally, national sovereignty is needed to address the political sensitivity of food production and security.
Recognizing the challenge of reaching such standards of quality, timeliness and legitimacy, key international organizations such as the Food and Agriculture Organization (FAO), the World Bank (WB) and the World Food Programme (WFP) have set up initiatives on agricultural data collection that draw on the potential of satellite Earth Observation (EO) for agricultural statistics. The main expected contributions of EO to agricultural statistics are cost-efficiency, uncertainty reduction, better granularity, improved timeliness and the provision of reliable information for sampling design.
The ESA “Sentinels for Agricultural Statistics” (Sen4Stat) project aims to facilitate the uptake of Sentinel EO-derived information in the official processes of National Statistical Offices (NSOs) in support of agricultural statistics. The project is working in four pilot countries, Spain, Ecuador, Senegal and Tanzania, thus addressing a wide diversity of both cropping systems and agricultural data collection protocols.
In close interaction with its pilot countries, the project conducted an in-depth review of how to integrate EO data efficiently into the current NSO workflows. To engage NSOs and lay out a roadmap for the uptake of EO for statistics, the following national use case studies were defined:
i. Coupling crop type maps and statistical ground surveys into regression estimators to derive crop area estimates focusing on the estimate error reduction and/or on the survey cost reduction, thus far beyond pixel counting;
ii. Coupling crop type maps, biophysical variables, crop yield in situ data (collected from crop cuts or farmer interviews) and district-level official crop yield statistics into regressions and specific models to derive crop yield and production estimates;
iii. Using crop type maps, biophysical variables and statistical ground surveys to (a) disaggregate the agricultural statistics to small administrative units and (b) improve the statistics timeliness through the provision of early crop area and yield indicators;
iv. Relying on cropland and crop type maps to build or update sampling master frames and optimize the sampling design;
v. Supporting the official reporting of the SDG indicators 2.4.1 and 6.4.1 from the above-mentioned information, enhanced by a data analytics dashboard.
Nationwide compilation and quality control of existing agricultural survey data from the four countries was carried out, leading to reference datasets with different maturity levels. Early EO prototype products were generated over test areas, supporting the above-mentioned use cases.
In Spain, in-situ data come from the ESYRCE database, an integrated list and area frame survey that includes square segments (700m - 250m) divided into agricultural plots. A crop type map was generated over test sites covering around 50,000 km². Additional training data for the non-cropland samples were retrieved from the Corine Land Cover map. Different AI classification algorithms were tested (e.g. random forest, temporal or spatial convolutional neural network, transformer), as well as different strategies to select parcels for calibration, paying specific attention to minor crops. In the end, the independent validation gave an overall accuracy of 82% for 35 crop classes and 91% when these were grouped into 12 main groups. F1 scores of the main crop type classes were most often higher than 0.8. Coupling these maps with the ESYRCE crop data significantly reduced the uncertainty of the crop acreage estimates (relative efficiency of 5.2). It also enabled the spatial disaggregation of the acreage statistics down to the municipality level with a reasonable estimation error, which is not possible using the ESYRCE data alone.
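The coupling of crop type maps with ground survey data through a regression estimator, and the relative efficiency quoted above, can be sketched as follows. This is a minimal textbook-style illustration, not the Sen4Stat implementation: the function name and the toy numbers are hypothetical, and the relative efficiency uses the large-sample approximation 1 / (1 - r²), where r is the correlation between the surveyed and the map-derived crop area per segment. Under that approximation, a relative efficiency of 5.2 would correspond to r² of roughly 0.81.

```python
def regression_estimator(ground_area, map_area, map_area_population_mean):
    """Regression estimate of the mean crop area per segment.

    ground_area: crop area observed by the survey in each sampled segment
    map_area: crop area derived from the EO crop type map in the same segments
    map_area_population_mean: mean map-derived crop area over ALL segments
    Returns (regression estimate, approximate relative efficiency).
    """
    n = len(ground_area)
    y_bar = sum(ground_area) / n
    x_bar = sum(map_area) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(map_area, ground_area))
    s_xx = sum((x - x_bar) ** 2 for x in map_area)
    s_yy = sum((y - y_bar) ** 2 for y in ground_area)

    slope = s_xy / s_xx
    r_squared = s_xy ** 2 / (s_xx * s_yy)

    # Correct the sample mean using the map, which is known everywhere.
    estimate = y_bar + slope * (map_area_population_mean - x_bar)
    # Large-sample approximation: Var(direct) / Var(regression) = 1 / (1 - r²).
    relative_efficiency = 1.0 / (1.0 - r_squared)
    return estimate, relative_efficiency


# Toy example: four sampled segments, areas in hectares (made-up values).
estimate, efficiency = regression_estimator(
    ground_area=[12.0, 18.0, 33.0, 37.0],
    map_area=[10.0, 20.0, 30.0, 40.0],
    map_area_population_mean=27.0,
)
```

A relative efficiency of k can be read as the regression estimator reaching the same precision as a ground-only survey with about k times as many segments.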
In Senegal, the mapping products were generated based on the Agricultural Annual Survey (AAS), a list frame survey in which parcels are identified by geolocated points. A national binary cropland vs non-cropland mask was produced using Sentinel-2 time series, with an overall accuracy of 96% and F1 scores for the cropland and non-cropland classes of 0.97 and 0.88, respectively. A crop type map including the seven main annual and permanent crop classes was generated over a test site of 80,000 km², with an overall accuracy of 85%. This was achieved thanks to calibration samples in the form of polygons instead of points, which proved instrumental for good classification accuracy. Based on this finding, a pilot data collection protocol was designed with the NSO to move from points to polygons in the AAS, and it was tested in the field in October 2021. The exploitation of this 2021 ground dataset is ongoing, with the production of new cropland and crop type maps and their use for supporting crop acreage and crop yield statistics.
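For readers less familiar with the map accuracy figures reported for Spain and Senegal, the sketch below shows how overall accuracy and per-class F1 scores are usually derived from a confusion matrix. The matrix values are invented for illustration and the function names are hypothetical; rows are assumed to hold the reference class and columns the mapped class.

```python
def overall_accuracy(confusion):
    """Share of validation samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


def f1_scores(confusion):
    """Per-class F1 score, with rows = reference class, columns = mapped class."""
    scores = []
    for k in range(len(confusion)):
        true_pos = confusion[k][k]
        false_neg = sum(confusion[k]) - true_pos
        false_pos = sum(row[k] for row in confusion) - true_pos
        denominator = 2 * true_pos + false_pos + false_neg
        scores.append(2 * true_pos / denominator if denominator else 0.0)
    return scores


# Invented two-class example: cropland vs non-cropland.
confusion = [
    [90, 10],  # reference cropland: 90 mapped as cropland, 10 as non-cropland
    [5, 45],   # reference non-cropland: 5 mapped as cropland, 45 as non-cropland
]
accuracy = overall_accuracy(confusion)
f1_cropland, f1_non_cropland = f1_scores(confusion)
```

Because F1 is computed per class, a minority class can score clearly lower than the overall accuracy suggests, which matches the pattern of the non-cropland class in the Senegal mask.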
In Ecuador, the in-situ data provided by the NSO come from the INEC statistical database, an integrated list and area frame survey with square segments of different sizes (300m - 600m - 1200m – 2400m) divided into agricultural plots. The database quality control revealed significant issues that need to be solved before it can be used to train classification algorithms: plots without labels (especially for non-cropland areas), geolocation issues and, more importantly, unreliable crop labels. A specific use case was set up to automate the quality control procedure and to enable an update of the database through active learning.
In Tanzania, in-situ data were not provided by the NSO but by the Copernicus4CEOGLAM project during the second half of 2021. The prototyping in this country is ongoing, with the objective of convincing the NSO of the added value of EO data and, hopefully, of obtaining authorization to access their statistical survey for a full demonstration.
A Sen4Stat User Workshop was organized in March 2020 to present these prototype results and prepare the next steps, which consist of selecting one or two priority use cases for each country and demonstrating them at national scale during the next 18 months. This demonstration will be run on the cloud, using the Sen4Stat open-source toolbox under development based on the prototyping exercise. Hands-on training and capacity building activities will also be organized with each pilot country, to facilitate the uptake of this new technology in their operational workflow.
The prototype phase highlighted the requirements in terms of agricultural surveys to efficiently support and take advantage of the EO processing. Indeed, most benefits will come from the mutual adjustment between in situ sampling (quantity, representativeness and quality) and innovative EO products. Mixing EO and agricultural statistics communities is essential to ensure a deep mutual understanding of both existing statistical information systems and the EO opportunities.
The Sen4Stat activities will be used as a pilot for the new "50 x 2030" initiative led by a wide group of partners including FAO and WB, which is certainly the biggest effort so far to fund agricultural data collection and reduce the agricultural data gap, especially in low and lower-middle income countries.
Learning-based tracking of below ground asparagus carbohydrates and crop key dates estimation from fusion of freely available spaceborne SAR and optical data
Cristian Silva-Perez1, Armando Marino1, Iain Cameron2
1 University of Stirling, Scotland, UK; 2 Environment Systems Ltd, Wales, UK.
Introduction
Peru is the largest exporter of asparagus in the world and the second-largest producer after China. This crop is key for the Peruvian agricultural sector and economy. The yield of asparagus crops is highly associated with the amount of carbohydrates stored below ground in the plant's root system. This amount defines the crop capacity to grow asparagus spears during harvest and to establish a healthy canopy when the harvest ends [1][2]. However, current methods for measuring carbohydrates in the field require expensive and destructive sampling.
Methods
In this paper, we evaluate the potential for in-season and near-real-time monitoring of the stored below-ground asparagus carbohydrates from remote sensing imagery. We propose a novel dynamic filtering framework that uses a fusion of freely available multitemporal Sentinel-1 and Sentinel-2 data to track within-season below-ground carbohydrates and crop age, and to forecast crop key dates. The fusion is achieved using an unscented Kalman filter together with Gaussian Process-based dynamic and observation models (GP-UKF). By learning the crop dynamics from historical data, the proposed method allows us to fill the gaps in Sentinel-2 imagery, provide daily interpolations of the crop biophysical variables and forecast the occurrence of crop key dates. It complements state-of-the-art filtering frameworks [3][4] through its ability to learn the models and uncertainties from data and to exploit the temporal dimension of the remote sensing observations. This makes the method proposed here transferable to other crop biophysical variables, crop types and locations.
Results
We validated the proposed filtering framework using two years of field observations and found that the method can successfully predict below ground carbohydrates, retrieve season crop age, forecast harvest date, and provide uncertainties associated with each of these predictions. The following list highlights some of the main findings of this work:
a) Although Sentinel-2 may provide higher accuracies than Sentinel-1 when several consecutive acquisitions are available, its performance drops when data are missing due to clouds.
b) The fusion of one Sentinel-1 orbit with a time series of Sentinel-2 data provides higher performance than the use of all three Sentinel-1 orbits that captured the study site, confirming the added value of active-passive sensor fusion.
c) The use of more than one Sentinel-1 acquisition geometry, combined with Sentinel-2 data when available, provided the best tracking performance and a reliable system for handling missing data from any individual sensor.
Under this configuration, the method achieves a Mean Absolute Error (MAE) of 1.802 degrees Brix (a surrogate for carbohydrates). It can also retrieve crop age and forecast the date when a parcel will be fit for harvest, with an MAE of 6 days in both cases. By tracking the below-ground carbohydrates as proposed here, farmers could drastically reduce the amount of destructive sampling required in this high-value crop.
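The idea of sequentially fusing two sensors while tolerating missing acquisitions can be illustrated with a much simpler filter than the one used in this work. The sketch below is a plain scalar Kalman filter with a random-walk state, not the GP-UKF with learned Gaussian Process models; all names and noise values are hypothetical. It only shows the mechanism: every available observation tightens the estimate, and days without data widen the uncertainty instead of breaking the track.

```python
def kalman_track(obs_a, obs_b, noise_a, noise_b, process_noise, mean0, var0):
    """Scalar random-walk Kalman filter fusing two observation series.

    obs_a, obs_b: per-time-step observations from two sensors; None = missing
    noise_a, noise_b: observation noise variance of each sensor
    process_noise: variance added to the state at every time step
    Returns the filtered means and variances for every time step.
    """
    mean, var = mean0, var0
    means, variances = [], []
    for z_a, z_b in zip(obs_a, obs_b):
        # Predict: the state is assumed to persist, uncertainty grows.
        var = var + process_noise
        # Update: assimilate whichever sensors delivered data at this step.
        for z, noise in ((z_a, noise_a), (z_b, noise_b)):
            if z is None:
                continue
            gain = var / (var + noise)
            mean = mean + gain * (z - mean)
            var = (1.0 - gain) * var
        means.append(mean)
        variances.append(var)
    return means, variances


# Sensor A alone, with a gap at the second time step.
means_a, vars_a = kalman_track([1.0, None], [None, None], 1.0, 1.0, 0.1, 0.0, 1.0)
# Sensors A and B together at the first time step.
means_ab, vars_ab = kalman_track([1.0, None], [1.0, None], 1.0, 1.0, 0.1, 0.0, 1.0)
```

In this toy setting the variance after the first step is lower when both sensors contribute, and it increases again during the gap, which is the qualitative behaviour described in findings a) to c).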
Acknowledgment:
This research was conducted as part of EO4cultivar, a project led by Environment Systems Ltd and funded by the UK Space Agency International Partnership Programme. Sentinel-1 and Sentinel-2 data were provided courtesy of ESA.
References
[1] Wilson, D., Cloughley, C., Jamieson, P., & Sinton, S. (2001). A model of asparagus growth physiology. X International Asparagus Symposium 589, 297–301.
[2] Wilson, D., Sinton, S., Butler, R., Drost, D., Paschold, P.-J., van Kruistum, G., Poll, J., Garcin, C., Pertierra, R., Vidal, I., et al. (2005). Carbohydrates and yield physiology of asparagus–a global overview. XI International Asparagus Symposium 776, 413–428.
[3] De Bernardis, C., Vicente-Guijalba, F., Martinez-Marin, T., & Lopez-Sanchez, J. M. (2016). Contribution to real-time estimation of crop phenological states in a dynamical framework based on NDVI time series: Data fusion with SAR and temperature. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(8), 3512-3523.
[4] McNairn, H., Jiao, X., Pacheco, A., Sinha, A., Tan, W., & Li, Y. (2018). Estimating canola phenology using synthetic aperture radar. Remote Sensing of Environment, 219, 196-205.
Providing reliable, consistent and scalable crop yield projections is one of the major challenges in monitoring and managing food security. Accurate yield forecasts as early as possible prior to harvest are therefore critical for market stability, farm management, grain companies and governments. There have been many prior attempts at forecasting yield using various remotely sensed and ground-data-intensive approaches. However, many of these empirical approaches cannot be applied confidently to areas other than those used for their calibration. Accordingly, this study proposes a new method named the VeRsatile Crop Yield Estimator (VeRCYe), which aims to overcome this limitation for wheat yield estimation at the pixel, field and regional scales by combining the advantages of high spatio-temporal resolution remote sensing and crop model simulations. In this process, the sowing and harvest dates of each field were detected (RMSE = 2.6 days) using PlanetScope imagery. In addition, Sentinel-2 and PlanetScope data were fused into daily 3 m surface reflectance images and an LAI dataset, which enabled VeRCYe to overcome the traditional trade-off between high spatial and high temporal resolution. The method was tested over multiple wheat fields located in the Australian wheat-belt, covering a large range of environmental conditions and farm management practices across three growing seasons (2017 - 2019). VeRCYe not only estimated field-scale yield with R² = 0.88 (RMSE of 757 kg/ha), but also produced yield maps at 3 m resolution up to four months before crop harvest. The advantages of VeRCYe are that (1) it can be used to estimate yield without the need for ground calibration, (2) it can in principle be applied to other crop types, and (3) it can be used with any remotely sensed LAI. Furthermore, VeRCYe can help to identify yield gaps and to understand yield variability, together with its causes, from the pixel to the regional level.
INTEGRATING SENTINEL-1 AND SENTINEL-2 TO MONITOR WINTER BARLEY WITHIN-FIELD YIELD
B. Mollà-Bononad¹, B. Franch¹ ², A. San Bautista³, C. Rubio⁴, D. Fita³, P. Ariza³
¹ Global Change Unit, Image Processing Laboratory, Universitat de València, Paterna (València) 46980, Spain
² Department of Geographical Sciences, University of Maryland, College Park MD 20742, USA
³ Departamento de Producción Vegetal, Universitat Politécnica de València (Valencia), 46022, Spain
⁴ Centro de Tecnologías Físicas, Universitat Politécnica de València (Valencia), 46022, Spain
According to the United Nations’ Department of Economic and Social Affairs, by the end of this decade the global population will have risen to more than 8 billion people. This underlines the need to increase food production in order to feed a growing population. In the context of a changing climate and energy vector scarcity, it also highlights the importance of securing and monitoring essential resources such as crops (and specifically winter cereals) to maintain and improve food production.
In this work we monitor within-field barley yield based on Earth Observation (EO) data from Sentinel-2 and Sentinel-1. Data from both satellites were preprocessed to obtain Sentinel-2 bottom-of-atmosphere (BOA) surface reflectance and the Sentinel-1 γ0 backscatter coefficient. Yield map data were measured by harvesters during the 2020 and 2021 seasons, providing dry yield roughly every 7 m over irregular polygons. The field data were therefore also preprocessed to achieve spatial consistency and reduce errors from the measuring software. Training and validation followed a k-fold cross-validation structure. The main objective of this work is to explore the integration of C-band SAR data (center frequency of 5.405 GHz) to monitor barley yield. On the one hand, Sentinel-1 data can increase the temporal resolution of Sentinel-2 and ensure that information on crop development can be retrieved even in cloudy conditions. On the other hand, in cloud-free conditions each spectral band and polarization is analyzed to find the combination that is best correlated with the final yield maps. To this end, linear regression (ordinary and regularized) and machine learning (random forest) algorithms were tested using different spectral bands and polarizations from both Sentinel satellites. An optimal date for the final model was selected based on performance metrics such as the coefficient of determination (r²), Root Mean Square Error (RMSE) and Relative RMSE (RRMSE), and the final model was used to create predicted within-field yield maps and track their uncertainties.
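The evaluation scheme described above (k-fold cross-validation scored with r², RMSE and RRMSE) can be sketched in a few lines of Python. The function names are hypothetical, and two definitions are assumptions since the abstract does not spell them out: RRMSE is taken as RMSE divided by the mean observed yield, in percent, and r² as 1 - SS_res / SS_tot.

```python
import math
import random


def rmse(observed, predicted):
    """Root Mean Square Error, in the units of the yield data."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rrmse(observed, predicted):
    """Relative RMSE in percent (assumed: RMSE / mean observed yield)."""
    return 100.0 * rmse(observed, predicted) / (sum(observed) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination (assumed: 1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Toy yields in t/ha (made-up values) to exercise the metrics.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
scores = (r_squared(observed, predicted), rmse(observed, predicted), rrmse(observed, predicted))
```

For yield-monitor points that are only about 7 m apart, splitting folds by field or by spatial block rather than by individual point may be worth considering, since neighbouring points are strongly correlated; the abstract does not state which unit was used.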