As we get further away from the SDG 2 target of Zero Hunger, food security remains one of the most pressing issues we face, especially in the context of increasing extreme weather events under a warming climate. As such, innovation in developing robust and scalable measures to monitor the world’s crops in a timely, transparent manner is a key component in helping to address this global challenge. With recent major advances in Earth observing (EO) satellites, cloud compute, GPS technologies, and machine learning/artificial intelligence, we currently have the data and tools needed to monitor and track nearly every field across the globe on a near daily basis. COVID-19 continues to touch nearly every aspect of our daily lives, and recent droughts, floods, supply chain issues, and conflict have devastated livelihoods and impacted agricultural production, leading to an unexpected relevance and urgency regarding the need for improved agricultural information, and serving to further highlight information gaps that satellite data can help fill. Understanding production prospects in near real time has never been more important in order to direct and prioritize early warning & proactive food security response and support well-functioning agricultural markets.
In 2016, in an effort to help address these challenges, NASA’s Applied Sciences Program called for a new concerted effort and, for the first time, openly competed a program on agriculture and food security. The NASA Harvest Consortium, led by the University of Maryland, was selected, and in November 2017 became NASA’s official program on Agriculture and Food Security. It is a stakeholder-driven program, motivated by the fact that more timely and accurate agricultural information, as enabled by EO data and advancing technologies, can significantly enhance key agricultural decisions, whether by humanitarian organizations, governments, insurance companies, or farmers. It is run as a multi-sectoral Consortium aimed at enabling and advancing the awareness, use, and adoption of satellite Earth observations by public & private organizations to sustainably benefit food security and agricultural resilience in the US and worldwide. The Consortium comprises more than 50 members spanning the public and private sectors, governmental, non-governmental and intergovernmental organizations, and the humanitarian sector. The University of Maryland provides a hub, with distributed partners and activities. Harvest is also NASA’s contribution to the international GEOGLAM program, mandated by the G20 in 2011 to increase market transparency and improve food security, and builds on the partnerships and work established through GEOGLAM. The consortium model has the advantage of focusing multiple institutions and partner organizations on specific problems and tasks, with more agility than individual conventional research proposals.
NASA Harvest works at global, regional, national, and field levels in agricultural systems that range from subsistence to large-scale commodity production. The program has three impact areas: agricultural land use, sustainability, and productivity. It aims to improve these three areas by advancing the quality, availability and timeliness of EO-based products and methods in crop land and crop type mapping, crop condition monitoring, crop statistics generation, crop yield forecasting and estimation, and cropping practices characterization. To accomplish this, its program of activities is designed to advance the state of the science and the state of use through innovation in field data collection and sharing, public-private partnerships, open trans-disciplinary data platforms, data integration, data science and capacity development. This talk will provide an overview of the NASA Harvest program and will highlight examples of its work and impact across the agricultural markets, humanitarian and private sector domains.
1. Introduction and context
Under challenging and changing climatic conditions, food security is under pressure in a number of countries across the world, in particular in Africa. Local food production can be reduced by erratic weather conditions such as drought and flooding, but also by other threats such as locust swarms. Crop type maps and crop area estimates are basic information of primary importance for crop monitoring, and providing them at high quality can enhance food security because stakeholders can act on this information. Remote sensing techniques are well suited to rapid and affordable crop type mapping, since large areas are monitored with a spatial resolution and a spectral detail that are constantly improving. However, the production of reliable crop type maps requires expensive field campaigns to collect ground truth, specific remote sensing knowledge, and processing capacity for the timely production of the information. The use of such approaches has increased rapidly in recent years thanks to the availability of dense Sentinel 1 and Sentinel 2 time series and in situ data. There is nevertheless still large potential in developing their operational use to derive crop type area statistics, through larger and more systematic in situ data collection to train machine learning algorithms. Agricultural monitoring systems in developing countries in particular would benefit from support for full access to, and exploitation of, innovative data and methods, and the GEOGLAM network is one way to coordinate these efforts.
Therefore, the main objective of the Copernicus service Copernicus4GEOGLAM is to strengthen EU support to the GEOGLAM initiative in developing countries and to respond to requests from countries for ad hoc baseline crop monitoring information, including crop type mapping and area estimates during and at the end of the growing season.
During this first year of activity, the service targeted three Areas of Interest (AoI) of about 100,000 km² each in Kenya, Tanzania and Uganda. Results of the mapping service for the first growing season processed are presented in this paper.
2. Field campaign
The objective of the field campaign is twofold: to provide training and validation data for the satellite image thematic classification, and to produce accurate and unbiased area estimates for the most important crops grown in the selected AoIs.
Therefore, a probabilistic sampling was applied based on a stratified systematic random approach to ensure that collected data could already be used directly to produce unbiased crop area estimates. On average 300 to 400 Primary Sample Units (PSUs) were selected for each AoI. These PSUs were visually interpreted based on available VHR imagery from virtual globes and latest Sentinel 2 imagery from the current growing season to delineate field parcel boundaries and non-cropped land use. PSUs were then surveyed in the field by a team of enumerators using a smartphone app to collect and upload the data on a daily basis to a central server. The data collected was checked daily for any missing or erroneous information so that immediate mitigating action could be taken if necessary.
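The sampling step can be illustrated with a minimal sketch, assuming a list-based sampling frame ordered spatially within each stratum. The stratum names, unit identifiers and sample sizes below are hypothetical; the abstract does not describe the actual frame or stratification.

```python
import random


def stratified_systematic_sample(strata, n_per_stratum, seed=0):
    """Draw a systematic sample with a random start inside each stratum.

    strata: dict mapping a stratum name to an ordered list of candidate unit ids
    n_per_stratum: dict mapping a stratum name to the number of units to draw
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = n_per_stratum[name]
        step = len(units) / n          # sampling interval (may be fractional)
        start = rng.random() * step    # random start within the first interval
        # Clamp the index as a guard against floating-point rounding.
        picks = [min(int(start + k * step), len(units) - 1) for k in range(n)]
        sample[name] = [units[i] for i in picks]
    return sample


# Hypothetical frame: grid cells numbered in spatial order within two strata.
strata = {
    "high_crop_density": [f"H{i:04d}" for i in range(1200)],
    "low_crop_density": [f"L{i:04d}" for i in range(3000)],
}
sample = stratified_systematic_sample(
    strata, {"high_crop_density": 200, "low_crop_density": 150}
)
```

Because every unit in a stratum has a known, non-zero inclusion probability, the surveyed PSUs can be expanded to the stratum area without further bias correction.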
More than 10,000 crop observations were collected for the selected PSUs and more than half of these sample units were covered with a mixed cropping pattern.
3. Satellite imagery classification
The data collected in the field was post-processed and split randomly into a training dataset (75% of the data) and a validation dataset (25%). For training, field boundaries were eroded and the smallest field parcels discarded to avoid the inclusion of mixed pixels.
On average, more than 1,000 Sentinel 2 scenes and around 300 Sentinel 1 scenes were processed for each AoI to produce the monthly syntheses used as input to the classification process. A 45-day integration period was used to create the monthly syntheses of Sentinel 2 imagery, based on interpolated values, for the duration of the growing season. Random Forest and TempCNN algorithms were tested, but so far the deep learning approach has not brought substantial improvements.
In-season crop mask and crop type maps were produced about one month after the completion of the field campaign (so-called in-season mapping), and end-of-season maps were produced one month after the end of the growing season. In total, 35 crop types and 10 non-crop land covers were registered. Crop types were regrouped according to the main crop types, resulting in a thematic map of 9 to 10 crop type classes.
Independent observations from the field campaign were used to assess the accuracy of the maps produced. For the end-of-season products, the overall accuracies range from 84 to 87% for the crop masks and from 80 to 81% for the crop type maps, an improvement of 1 to 8% over the in-season products. Accuracies for some of the main crops are satisfactory, with F1 scores around 0.6-0.7 in some cases, but some crops still exhibit low accuracies, mainly due to the very small parcel size (e.g. in Uganda) and mixed cropping patterns.
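As an illustration of the random 75/25 split, here is a minimal sketch that assumes the split is made on parcel identifiers. The abstract does not say whether the split was done per parcel, per PSU or per observation, and the function name and seed are hypothetical.

```python
import random


def split_train_validation(parcel_ids, train_fraction=0.75, seed=42):
    """Randomly split parcel identifiers into training and validation lists."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(parcel_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical example with 1,000 surveyed parcels.
train, validation = split_train_validation(range(1000))
```

Splitting on whole parcels rather than on pixels keeps pixels from the same field out of both sets at once, which would otherwise inflate the validation accuracy.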
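The two accuracy measures quoted above can be computed from a confusion matrix as in the following sketch. The matrix values and class names are invented for illustration only, and the sketch ignores any weighting by sampling stratum that the actual assessment may have applied.

```python
def overall_accuracy(matrix):
    """Share of validation samples on the diagonal (rows: reference, columns: map)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def f1_score(matrix, k):
    """F1 score of class k: harmonic mean of precision and recall."""
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp   # mapped as k, reference is another class
    fn = sum(matrix[k]) - tp                  # reference is k, mapped as another class
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 3-class matrix: maize, beans, other.
confusion = [
    [60, 25, 15],
    [20, 50, 10],
    [10, 5, 105],
]
oa = overall_accuracy(confusion)       # 215 correct out of 300
maize_f1 = f1_score(confusion, 0)      # precision 60/90, recall 60/100
```

In this invented example the overall accuracy is about 0.72 while the maize F1 score is about 0.63, which shows how a map can score well overall while individual crops remain harder to separate.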
4. Crop area estimates
Crop area estimates can be derived directly from the field data alone using the so-called direct expansion method, provided that the data were collected with a probabilistic sample or that a suitable method is available to correct any potential bias. Early area estimates (direct expansion estimators) can therefore be provided as soon as the results from the field campaign have been collated and analysed, even before the classification of the satellite imagery.
Thanks to the probabilistic sampling approach, the estimate of the proportion (y) of a class (c) and its variance can be calculated for each stratum. The total estimate is the average of the stratum proportions weighted by the area covered by each stratum. The standard error for the whole area is then the square root of the sum, over strata, of the variance of the stratum proportion multiplied by the square of the stratum area. However, the confidence interval of the direct expansion estimators is likely to be relatively large. To improve the precision of the estimates, field segment data can be combined with classified satellite imagery. In this latter case (i.e., using the classification map), a so-called regression estimator can be applied and its variance calculated. Area estimates obtained in this way can differ considerably from simple pixel counts, because pixel counts are affected by the misclassification errors of each class. Area estimates derived from the regression estimator are corrected for misclassification errors, and they are more precise than the direct expansion estimates thanks to the complete coverage provided by the image classification.
In summary, direct expansion estimates are unbiased but suffer from high sampling error; pixel counts from classified satellite imagery are biased but have no sampling error; and the combination of ground data and classified imagery is unbiased and exhibits a reduced sampling error. The gain from the regression estimator is measured by the relative efficiency, which is the ratio of the variance of the direct expansion estimate to the variance of the regression estimate.
Cropland represented between 25 and 30% of the AoIs, and the dominant crop was maize in all 3 AoIs, ranging from just under 500,000 ha in the AoI in Uganda to over 1.1 million ha in Kenya and Tanzania. With the regression estimate, a 95% confidence interval of 75,000 ha was achieved in Uganda, and 115,000 and 155,000 ha in Kenya and Tanzania, respectively. A relative efficiency equal to or greater than 2 was achieved for maize in all 3 AoIs, meaning that to achieve the same level of uncertainty without the crop type map, twice as many PSUs would have been required. Similar or even better results were obtained for other crops.
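The direct expansion calculation described above can be sketched as follows. This is a minimal illustration that assumes simple random sampling within each stratum and omits the finite population correction; the stratum areas and PSU proportions are invented.

```python
import math
from statistics import mean, variance


def direct_expansion(strata):
    """Stratified direct expansion estimate of a crop area and its standard error.

    strata: list of (stratum_area_ha, [crop proportion observed in each PSU])
    """
    total_area = 0.0
    total_variance = 0.0
    for area_ha, proportions in strata:
        n = len(proportions)
        p_mean = mean(proportions)                 # estimated proportion in the stratum
        var_of_mean = variance(proportions) / n    # variance of that estimate
        total_area += area_ha * p_mean
        total_variance += area_ha ** 2 * var_of_mean
    return total_area, math.sqrt(total_variance)


# Hypothetical example: two strata with four PSUs each.
example_strata = [
    (1000.0, [0.2, 0.4, 0.3, 0.1]),
    (500.0, [0.6, 0.8, 0.7, 0.5]),
]
area_ha, standard_error_ha = direct_expansion(example_strata)
half_width_95 = 1.96 * standard_error_ha   # normal-approximation 95% half-width
```

With only four PSUs per stratum the half-width is a large share of the estimate (575 ha with a standard error of about 72 ha), which is the kind of wide interval that motivates combining the field data with the classified imagery.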
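A minimal sketch of the regression estimator for a single stratum follows. It uses the large-sample approximation in which the variance of the regression estimate equals the direct expansion variance multiplied by (1 - r²), with r the correlation between ground and map proportions in the sampled PSUs. The variable names and data are hypothetical, and the exact variance formula used by the service is not given in the abstract.

```python
from statistics import mean


def regression_estimate(ground, classified, classified_population_mean):
    """Regression estimate of a crop proportion and its relative efficiency.

    ground: crop proportion observed in the field for each sampled PSU
    classified: crop proportion in the map for the same PSUs
    classified_population_mean: crop proportion in the map over the whole stratum
    """
    y_bar, x_bar = mean(ground), mean(classified)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(classified, ground))
    sxx = sum((x - x_bar) ** 2 for x in classified)
    syy = sum((y - y_bar) ** 2 for y in ground)
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    # Shift the field mean by the gap between the map-wide and in-sample map means.
    y_reg = y_bar + slope * (classified_population_mean - x_bar)
    relative_efficiency = 1.0 / (1.0 - r_squared)   # Var(direct) / Var(regression)
    return y_reg, relative_efficiency


# Hypothetical example: five PSUs in one stratum.
ground = [0.10, 0.30, 0.50, 0.20, 0.40]
classified = [0.15, 0.25, 0.45, 0.30, 0.35]
y_reg, rel_eff = regression_estimate(ground, classified, 0.28)
```

The better the map agrees with the field observations, the closer r² is to 1 and the larger the relative efficiency; a map that is uncorrelated with the field data gives a relative efficiency of 1, i.e. no gain over direct expansion.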
5. Conclusions
This study shows that, despite the COVID-19 crisis, it was possible to collect detailed field data following a strict probabilistic sampling approach in Africa, and that the data could be used to train the classification of Sentinel imagery and produce reliable crop mask and crop type maps. Crop area estimates with an already good level of accuracy could be produced before harvest. Combining field data with the maps can reduce the amount of field data that must be collected to achieve precise crop area estimates, and the crop type maps provide useful information on the areas where crops are grown. However, the accuracy of some of the main crop types can be relatively low, mainly due to (i) small crop parcels, (ii) mixed cropping patterns and (iii) heterogeneous crop stage development across the AoI. In addition, even though the Sentinel 2 synthesis approach appears to be effective, further improvements may be achieved by finding synergies with Sentinel 1 and by integrating higher resolution imagery. The results described here can already be visualized in the Copernicus Global Land Hotspots explorer. Both the in situ data collected and the results produced by the presented approach will be made available in a fully free and open way.
Following the global food price hikes in 2007/08 and 2010/11, as part of the Action Plan on Food Price Volatility and Agriculture, the G20 Heads of State endorsed in their 2011 Declaration both the Group on Earth Observations Global Agricultural Monitoring (GEOGLAM) and the Agricultural Market Information System (AMIS), committing to improve market information and transparency in order to make international markets for agricultural commodities more effective. To that end, the “Agricultural Market Information System” (AMIS) was launched to improve information on markets, and the “Global Agricultural Geo-monitoring Initiative” (GEOGLAM) was created to coordinate satellite monitoring observation systems in different regions of the world in order to enhance crop production projections and weather forecasting data. After the successful engagement of the different G20 countries, especially in the Americas, Agricultural Monitoring in the Americas (AMA) was formally launched in 2018. Its main goal is to address the gaps in the region related to the use of remote sensing technologies in operational agricultural work. AMA is led and coordinated by the GEOGLAM Secretariat with NASA Applied Sciences support. For the participating countries, participation is voluntary and on a best-efforts basis, and all contributions are considered in-kind. Most of the methods used by AMA revolve around establishing processes and guidance for developing capacities to “translate” the science into actionable information that is readily interpretable by a non-technical decision-making audience. AMA works to make sufficient EO data available for member usage. On a regular basis, AMA holds working group teleconferences and also coordinates and organizes regional meetings and training events aimed at increasing EO usage for agricultural monitoring. The objective of the presentation is to discuss the history of AMA, the current situation, and the challenges ahead in the region.
Meteorological services in developing countries hold significant volumes of station data, the most frequently available being rainfall, with air temperature (maximum and minimum) usually present in smaller numbers. On the other hand, satellite derived rainfall estimates and other satellite indicators on vegetation and land surface temperature are ever more widely available. A few of these estimates (e.g. CHIRPS) already incorporate commonly available raingauge data.
Regular reporting on the evolution of the rainfall season by national meteorological services mostly relies on station data, whether in the form of tables or of interpolated station data and their anomalies. Usage of satellite derived rainfall information is less common and frequently relies on ready-made products from suppliers such as FEWS NET.
An optimum solution for National Meteorological Services is to integrate station data with satellite data to derive blended products that maximize the information available to these institutions and which perform better than any of the individual components in isolation. In the case of rainfall, blending of station data corrects conditional biases in satellite rainfall estimates improving representation of rainfall fields. We present here results for a number of countries showing improved performance from the blended estimates. Similar approaches can be applied for air temperature, using land surface temperature as a background interpolator.
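To make the idea of blending concrete, the sketch below applies a simple additive residual correction: the gauge-minus-satellite difference at each station is interpolated with inverse distance weighting and added back to the satellite field. This technique is swapped in purely as an illustration; the abstract does not describe the blending algorithm actually used, and the grid, station values and function names are hypothetical.

```python
import math


def idw(points, values, target, power=2.0):
    """Inverse-distance-weighted interpolation of point values at a target location."""
    numerator = denominator = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value               # target coincides with a station
        weight = distance ** -power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def blend(satellite, gauges, power=2.0):
    """Correct a satellite rainfall grid with gauge observations.

    satellite: dict {(x, y): rainfall_mm} for every grid cell
    gauges: dict {(x, y): rainfall_mm} for the cells that contain a station
    """
    locations = list(gauges)
    residuals = [gauges[loc] - satellite[loc] for loc in locations]
    return {
        cell: max(0.0, value + idw(locations, residuals, cell, power))  # no negative rain
        for cell, value in satellite.items()
    }


# Hypothetical three-cell transect with stations at both ends.
satellite = {(0, 0): 10.0, (1, 0): 12.0, (2, 0): 8.0}
gauges = {(0, 0): 14.0, (2, 0): 6.0}
blended = blend(satellite, gauges)
```

At the station cells the blended field reproduces the gauge values, while the cell in between keeps the spatial pattern of the satellite estimate and receives the average of the two station corrections.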
The World Food Programme (WFP) supports National Meteorological Services (NMS) by strengthening their capacity for early warning and seasonal monitoring. NMS are given access to a WFP cloud-based system for processing near-global EO data (rainfall estimates, MODIS NDVI and LST, snow cover), where they can upload their station data and download blended rainfall products (amounts at a variety of timescales and the respective anomalies), blended air temperature data (with MODIS LST), NDVI, and non-blended rainfall products (SPI, dry spells, number of rain days).
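Two of the simpler rainfall indicators listed above, the number of rain days and the longest dry spell, can be derived from a daily series as in this sketch. The 1 mm threshold for a rain day is an assumption chosen for illustration and is not taken from the abstract.

```python
def rain_days(daily_mm, threshold_mm=1.0):
    """Number of days with rainfall at or above the threshold."""
    return sum(1 for rain in daily_mm if rain >= threshold_mm)


def longest_dry_spell(daily_mm, threshold_mm=1.0):
    """Length in days of the longest run of consecutive days below the threshold."""
    longest = current = 0
    for rain in daily_mm:
        if rain < threshold_mm:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Hypothetical ten-day (dekad) rainfall series in millimetres.
dekad = [0.0, 0.2, 5.0, 0.0, 0.0, 0.0, 12.0, 0.5, 3.0, 0.0]
n_rain_days = rain_days(dekad)
max_dry_spell = longest_dry_spell(dekad)
```

For the series shown, three days reach the threshold and the longest dry spell lasts three days.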
The availability of these products enables the production of regular improved reports with better identification of drought episodes and extreme rainfall events. A well featured web-platform (PRISM) designed by WFP can be deployed for display and analysis of the satellite and blended products and allows the integration of hazard indicators with socio-economic and food security data.
WFP has also funded data recovery initiatives that led to the mobilization of 40 years of rainfall and temperature records. Applying the blending algorithms to these data sets yields national databases of gridded rainfall of very high quality, which can be used as reference data in early warning activities, for extensive climatological analysis of rainfall and temperature patterns, and for the assessment of sectoral climate risks.
Examples from various country systems (Mozambique, Namibia, Zimbabwe, Cuba, Sri Lanka) are presented, illustrating the benefits of well-coordinated collaboration with Meteorological Offices for high quality early warning and seasonal monitoring.
Crop yield forecasting is essential to ensure food security at national and international level. Studies conducted in different parts of the world underline that effective crop yield forecasting requires a thorough understanding of the factors that explain interannual variability of crop yield at regional scale. In large geographical areas such as the European Union (EU), the importance of these factors is likely to differ between regions, due to regional variation in growing conditions. However, this is still poorly understood in the case of forecasting wheat yield, despite the EU’s importance for global wheat production. Therefore, the objective of this study was to assess which environmental variables, derived from satellite and meteorological data, are the main factors that explain interannual variability of wheat yield at regional scale within the EU, and whether the relative importance of these factors differs between EU regions. In addition to differences between regions, we investigated whether the relative importance of these factors differed between months of the growing season.
For reference data, we used regional time series of soft and durum wheat yields, obtained from the national statistical institutes of the EU Member States. Meteorological data and crop biomass indicators were used as explanatory variables. Meteorological data, such as average daily temperature and daily rainfall, were obtained from the JRC-MARS weather database, which provides daily data from station observations interpolated to a 25 × 25 km grid. As an indicator of crop biomass, we extracted 10-day composites of NDVI (Normalized Difference Vegetation Index) from MODIS (Moderate-Resolution Imaging Spectroradiometer) imagery at 250 m spatial resolution. Both meteorological and NDVI variables were spatially averaged to the same administrative regions as the reference (crop yield) data. The regional time series of the explanatory variables were aggregated to monthly averages to obtain meteorological and remote sensing factors. Each of these factors was related to the wheat yield time series through a linear regression, and the factors were then ranked, region by region, by root mean square error (RMSE). This ranking was performed for every month of the growing season to assess whether the relative importance of the factors changed over time. Next, we used a hierarchical clustering approach to cluster regions with similar behaviour according to the main factors describing their spatial distribution within the EU. Lastly, we analysed the robustness of the main factors of each cluster in terms of whether they accurately predicted wheat yield in years with significant yield loss.
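The factor ranking step can be sketched as follows for one region and one month. The sketch fits a simple linear regression per factor and ranks the factors by in-sample RMSE; whether the study used in-sample or cross-validated errors is not stated, and the factor names and values are invented.

```python
import math
from statistics import mean


def fit_rmse(x, y):
    """RMSE of an ordinary least squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    intercept = y_bar - slope * x_bar
    squared_errors = [(b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)]
    return math.sqrt(mean(squared_errors))


def rank_factors(factors, yields):
    """Return factor names ordered from lowest to highest RMSE."""
    return sorted(factors, key=lambda name: fit_rmse(factors[name], yields))


# Hypothetical region: five years of yield (t/ha) and two monthly factors.
yields = [5.0, 6.0, 4.0, 7.0, 5.5]
factors = {
    "rain_may_mm": [40.0, 80.0, 60.0, 50.0, 70.0],
    "ndvi_may": [0.60, 0.70, 0.50, 0.80, 0.65],
}
ranking = rank_factors(factors, yields)
```

Repeating the ranking for every month and every region gives the table of leading factors on which the clustering of regions can then be based.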
Based on this analysis, we describe, in a consistent way for all the wheat producing regions of the EU, the main factors that explain interannual yield variability of soft and durum wheat at regional scale, and how the relative importance of these factors varies across the growing season. Secondly, with the clustering analysis, we identify regions with similar explanatory patterns that can be considered as a single unit for statistical analysis, thus providing a possible solution for small sample size in regional analysis. Finally, we show that the linear regression approach is not sufficiently robust in years with significant yield loss. These findings provide a valuable baseline for wheat yield forecasting at regional scale in the EU.
NASA Harvest is NASA’s Food Security and Agriculture program. Its main objective is to enhance the use of satellite data in decision making related to food security and agriculture. Within this context, one of the main priorities is providing valuable information on crop conditions and accurate and timely crop yield forecasts. This work presents the Agriculture Remotely-sensed Yield Algorithm (ARYA), a new EO-based empirical winter wheat yield forecasting model. The algorithm is based on the evolution of the Difference Vegetation Index (DVI) from the Moderate Resolution Imaging Spectroradiometer (MODIS) at 1 km resolution and the Growing Degree Days (GDD) from MERRA-2 reanalysis data. Additionally, the model includes a correction for crop stress conditions, captured by the accumulated daily difference between the Land Surface Temperature (LST) from MODIS and the air temperature at the MODIS overpass time from MERRA-2. The model is calibrated at subnational level using historical yield statistics from 2001 to 2019. In each administrative unit, a different calibration coefficient (based on all possible combinations of the three regressors in a linear model) is selected depending on the statistical significance of each variable. The model was applied to forecast the national and subnational winter wheat yield in the United States, Ukraine, Russia, France, Germany, Argentina and Australia (over 70% of global wheat exports) from 2001 to 2019. The results show that ARYA provides yield estimates with 5-15% (0.3 ± 0.1 t/ha) error at national level and 7-20% (0.6 ± 0.1 t/ha) error at subnational level, starting from 2 to 2.5 months prior to harvest.
Additionally, in this work we explore the applicability of ARYA at within-field scale and how high resolution data can help improve ARYA. To this end, we test the ARYA calibration equations with Sentinel 2 data and evaluate how well they forecast within-field wheat yield measurements from harvester machines over more than 100 ha in Valladolid (Spain) during the 2020 and 2021 seasons.
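The building blocks named above can be illustrated with a short sketch: DVI as the difference between near-infrared and red reflectance, accumulated GDD from daily mean temperatures, and the enumeration of every combination of the three regressors. The 0 °C base temperature and the regressor labels are assumptions made for illustration, and the sketch does not reproduce ARYA's calibration itself.

```python
from itertools import combinations


def dvi(nir, red):
    """Difference Vegetation Index: near-infrared minus red reflectance."""
    return nir - red


def accumulated_gdd(daily_mean_temp_c, base_temp_c=0.0):
    """Growing degree days accumulated over a series of daily mean temperatures."""
    return sum(max(0.0, t - base_temp_c) for t in daily_mean_temp_c)


REGRESSORS = ("dvi", "gdd", "lst_minus_air_temp")   # hypothetical labels


def candidate_models(regressors=REGRESSORS):
    """All non-empty subsets of the regressors, each defining one linear model."""
    return [
        subset
        for size in range(1, len(regressors) + 1)
        for subset in combinations(regressors, size)
    ]


models = candidate_models()
example_gdd = accumulated_gdd([-2.0, 3.0, 10.0])
example_dvi = dvi(0.45, 0.08)
```

Three regressors give seven candidate linear models per administrative unit, from which one would be retained according to the statistical significance of its variables.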