Soil moisture is an essential variable in the critical zone. It controls energy exchange at the Earth's surface and major biogeochemical cycles. Its estimation has applications in fields as varied as agriculture, forestry, ecology, continental hydrology, meteorology, defense and planetology. Despite the development of new sensors, the establishment of in situ monitoring networks, improved data assimilation and the launch of new satellites, much progress remains to be made in estimating soil moisture from reflectance measurements in the solar domain (0.3-3 µm). The interpretation of these measurements, which provide information on surface water content, is complex because soil reflectance also depends on other factors such as surface roughness, porosity, or organic matter content. Most of the methods used for this purpose are empirical: they rely on spectral indices, statistical relationships, multivariate analysis, wavelet analysis, continuum removal, inverse Gaussian, etc., and are rarely physically based. In the same way that radiative transfer models have led to considerable progress in the study of plant canopies, their contribution could be very beneficial for the study of bare soils. Some models simulate the reflectance of a smooth, wet soil with a simple representation: they treat a wet soil as a dry soil covered with a thin film of water occupying a fraction of its surface, and then calculate the multiple reflections inside this layer. This is the case, for example, of the MARMIT (multilayer radiative transfer model of soil reflectance) model (Bablet et al., 2018). The latest version, MARMIT-2, includes soil particles in this water layer, creating turbidity. The intrinsic optical properties of this turbid layer can be derived using mixing rules that require detailed knowledge of the complex refractive index of soil particles (Dupiau et al., 2022).
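The thin-film representation described above can be illustrated with a short sketch. It is not the published MARMIT formulation: it only sums the multiple reflections between a dry soil and an absorbing water film as a geometric series, and then mixes wet and dry fractions linearly. The function names, the interface reflectances r12 and r21, and all numerical values are assumptions chosen for illustration.

```python
import math


def film_covered_reflectance(r_dry, absorption, thickness, r12, r21):
    """Reflectance of a dry soil seen through a thin absorbing water film.

    r_dry      : reflectance of the dry soil (0..1)
    absorption : absorption coefficient of water at this wavelength (1/cm)
    thickness  : film thickness (cm)
    r12, r21   : reflectance of the air/water interface seen from the air
                 side and from the water side (hypothetical inputs)
    """
    t12 = 1.0 - r12  # fraction of light entering the film
    t21 = 1.0 - r21  # fraction of light leaving the film
    tw = math.exp(-absorption * thickness)  # one-way transmittance of the film
    # Each extra round trip multiplies the signal by r21 * r_dry * tw**2,
    # so the infinite sum of reflections is a geometric series.
    round_trip = r21 * r_dry * tw ** 2
    return r12 + t12 * t21 * r_dry * tw ** 2 / (1.0 - round_trip)


def wet_soil_reflectance(r_dry, coverage, absorption, thickness,
                         r12=0.02, r21=0.6):
    """Linear mix of film-covered and dry soil; coverage is the wet fraction."""
    r_wet = film_covered_reflectance(r_dry, absorption, thickness, r12, r21)
    return coverage * r_wet + (1.0 - coverage) * r_dry


# Illustrative values only: a soil with 30 % dry reflectance, half covered
# by a 0.02 cm film with an absorption coefficient of 5 per cm.
example = wet_soil_reflectance(r_dry=0.3, coverage=0.5,
                               absorption=5.0, thickness=0.02)
```

In this simplified picture the wet soil is darker than the dry soil at every wavelength, and the darkening is strongest where water absorbs most.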
The extension of MARMIT to a rough surface is complex and, to date, there is no model capable of simulating the BRDF (Bidirectional Reflectance Distribution Function) of a rough, wet soil. A first approach consists of coupling MARMIT with a BRDF model, for example the Hapke model, widely used in planetary science, or the RPV (Rahman-Pinty-Verstraete) model, which is a simplified version. Another approach is to use a ray-tracing model, for example the DART (Discrete Anisotropic Radiative Transfer) model. This model can simulate the spectral and directional reflectance of virtual scenes represented by 3D digital terrain models (DTM), as long as the local optical properties are known. The retrieval of soil water content will be performed using spectral indices or by inverting the model on hyperspectral images correlated with in situ soil water content measurements.
Within the www.OpenLandMap.org project we are currently producing the best unbiased predictions of soil organic carbon, soil pH, total N, bulk density and clay content at 1-km spatial resolution for the period 1982–2020 for the global land mask and for the standard depth intervals 0–30, 30–100 and 100–200 cm. For this we use a globally consistent compilation of soil observations and measurements (https://gitlab.com/openlandmap/compiled-ess-point-data-sets; currently over 650,000 georeferenced soil samples) combined with a gridded data cube representing major soil forming processes: climate (rainfall, snow, temperature regime), relief, land use / land cover (HYDE, HILDA+ datasets, monthly NDVI) and land degradation / deforestation datasets (see presentation: "Spatiotemporal Earth-Science data Cube at 1-km resolution 1982-2020 to enable dynamic system modeling" at the same conference). These products are a major update of the global soil carbon predictions published in Sanderman et al. (2017), which were still based on a coarser resolution of 10 km. The predictive soil mapping models are built in four basic steps. (1) Covariate layers are prepared to represent soil forming processes such as accumulation and deposition / erosion. For this we derive cumulative indices, e.g. cumulative rainfall and snowfall, that help enhance soil forming processes and quantify cumulative effects. (2) Target variables are overlaid with time-series of EO data products, using the year as the temporal reference. (3) An Ensemble Machine Learning model is fitted using iterative feature selection and fine-tuning / optimization through cross-validation. (4) The fitted models are used to generate time-series of predictions, which are then analyzed for trends and overlaid with other Earth System science products such as GPP dynamics and similar.
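Steps (1) and (2) above can be sketched for a single pixel as follows. This is an illustration under stated assumptions, not the project's code: the function names, the field names ("year", "soc", "cum_rain_mm") and the rainfall figures are hypothetical.

```python
from itertools import accumulate


def cumulative_by_year(annual_values, start_year):
    """Map each year to the running total of an annual series."""
    totals = accumulate(annual_values)
    return {start_year + i: total for i, total in enumerate(totals)}


def overlay_by_year(samples, covariate_by_year, name):
    """Attach to each soil sample the covariate value of its sampling year."""
    rows = []
    for sample in samples:
        year = sample["year"]
        if year in covariate_by_year:  # skip samples outside the data cube
            rows.append({**sample, name: covariate_by_year[year]})
    return rows


# Made-up annual rainfall (mm) at one pixel for 1982-1985.
annual_rain_mm = [610, 580, 640, 700]
cum_rain = cumulative_by_year(annual_rain_mm, start_year=1982)

# Made-up soil samples located in that pixel.
samples = [
    {"id": "p1", "year": 1983, "soc": 21.5},
    {"id": "p2", "year": 1985, "soc": 18.0},
    {"id": "p3", "year": 1975, "soc": 25.0},  # before the cube starts
]
regression_matrix = overlay_by_year(samples, cum_rain, "cum_rain_mm")
```

Each row of the resulting matrix pairs a measured soil property with the cumulative covariate for its sampling year, which is the form a model in step (3) would be fitted on.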
The results show that soil pH in particular can be predicted with reasonable accuracy (CCC=0.72), while models for soil organic carbon (CCC=0.65) and total N (CCC=0.60) are somewhat less accurate, which might also be due to harmonization problems between the various datasets.
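The abstract does not expand CCC. Assuming it denotes Lin's concordance correlation coefficient, a minimal sketch of the statistic is given below; the function name is hypothetical.

```python
def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # The squared mean difference penalises systematic bias, which a plain
    # correlation coefficient would ignore.
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
```

Unlike the Pearson correlation, this measure drops below 1 when predictions are perfectly correlated with the observations but shifted or rescaled.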
Our key justification for choosing this framework (spatiotemporal Machine Learning over a long span of years) is that soil properties primarily change gradually as an effect of various past accumulation (rainfall, primary productivity) and erosion processes, and such changes thus cannot be easily detected directly by using e.g. Sentinel-2 products (see e.g. Castaldi et al. (2016) and Castaldi et al. (2019)). In other words, we assume that soil properties such as soil organic carbon are not an effect of current climatic conditions, but of accumulated past conditions running over longer periods of time. In contrast to the method described in Heuvelink et al. (2021), where soil samples are overlaid and correlated with current values of climate and NDVI, our approach looks primarily at correlating soil properties with soil forming processes. A sudden change in land cover (forest cover) can possibly result in more abrupt changes in soil properties, but even then we assume that erosion processes would take at least a few years to change the soil properties significantly. We hence assume that the model should be hybrid, i.e. that it should accommodate both abrupt and gradual changes, although the majority of the covariates should reflect the long-term processes. As a compromise between data availability, spatial resolution and the temporal scale of soil formation, we have decided to model soil dynamics for the period 1982-2020, as we assume that this period is long enough to detect and quantify significant changes in the chemical soil properties of interest. Modeling soil distribution over such a long time span is also important for predicting future soil conditions: once the models for SOC, total N and pH are calibrated for a longer span of years, they can also be used to predict the future state of soil properties under different climate scenarios (e.g. IPCC AR6).
Although the target spatial resolution of the spacetime modeling framework is limited, this is an unprecedented product because our predictions cover a 40-year period and can hence be used to detect positive and negative trends (e.g. soil carbon losses and salinization / acidification processes) and can potentially be used to predict the future state of soil under different climate scenarios (as an input for projects such as https://probablefutures.org/, https://landchangestories.org/ and similar).
Time-series of predictions of soil properties for three standard depths (0-30 cm, 30-100 cm and 100-200 cm) will be made available in early 2022 via the OpenLandMap.org data portal under the CC-BY license and through our STAC library, allowing for unrestricted access and use.
References:
1. Castaldi, F., Chabrillat, S., & van Wesemael, B. (2019). Sampling strategies for soil property mapping using multispectral Sentinel-2 and hyperspectral EnMAP satellite data. Remote Sensing, 11(3), 309. https://doi.org/10.3390/rs11030309
2. Castaldi, F., Hueni, A., Chabrillat, S., Ward, K., Buttafuoco, G., Bomans, B., ... & van Wesemael, B. (2019). Evaluating the capability of the Sentinel 2 data for soil organic carbon prediction in croplands. ISPRS Journal of Photogrammetry and Remote Sensing, 147, 267-282. https://doi.org/10.1016/J.ISPRSJPRS.2018.11.026
3. Hengl, T., Miller, M. A., Križan, J., Shepherd, K. D., Sila, A., Kilibarda, M., ... & Crouch, J. (2021). African soil properties and nutrients mapped at 30 m spatial resolution using two-scale ensemble machine learning. Scientific Reports, 11(1), 1-18. https://doi.org/10.1038/s41598-021-85639-y
4. Heuvelink, G. B., Angelini, M. E., Poggio, L., Bai, Z., Batjes, N. H., van den Bosch, R., ... & Sanderman, J. (2021). Machine learning in space and time for modelling soil organic carbon change. European Journal of Soil Science, 72(4), 1607-1623. https://doi.org/10.1111/ejss.12998
5. Sanderman, J., Hengl, T., & Fiske, G. J. (2017). Soil carbon debt of 12,000 years of human land use. Proceedings of the National Academy of Sciences, 114(36), 9575-9580.
Given rapid environmental changes and the emergence of Earth Observation (EO) data with high spatiotemporal resolution, innovative solutions are needed to support policy frameworks and soil-related monitoring and reporting activities towards sustainable development. Since 2015, the Sentinel-2 constellation has provided an uninterrupted multispectral data record of the Earth's land surface, offering scientists new possibilities to better understand and quantify changes in soil ecosystems. Fully exploiting these space-borne data requires new approaches for their pre-processing and analysis and, with the recent developments in deep learning, a great potential to revolutionize the processing of EO data is foreseen.
Here, we present the results from a Soil Organic Carbon (SOC) content mapping approach that builds on Sentinel-2’s spatial, spectral, and temporal characteristics, enabling differentiation within agricultural parcels. Leveraging key functionalities from the existing Soil Composite Mapping Processor (SCMaP), we analysed the Sentinel-2 archive between 2015 and 2020, taking into account 12 tiles in order to cover the entire Bavarian territory. SCMaP-derived analysis-ready data support the generation of soil reflectance composites (SRC) representing different per-date temporal mosaicking products. In this study, three-year composites were tested that consider only the spring months (spring-SRC), the spring and autumn months (autumn-SRC), or the entire period (full-SRC). The framework then makes use of a recently developed multiple-input Convolutional Neural Network (CNN) algorithm that has been introduced within the framework of the ESA WORLDSOILS project. The proposed CNN leverages the techniques of one-dimensional CNNs to estimate SOC from a spectral signature which has undergone multiple pre-processing treatments. The initial SRC reflectance values, their conversion to absorbance and the application of a standard normal variate are seen by the network as different channels constructing a distinct feature space. We developed and calibrated our CNN algorithm based on soil samples (calibration = 80%, validation = 20%) drawn from a compilation of different databases, including LUCAS 2015 and national data archives.
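The three input channels described above can be sketched as follows. This covers only the pre-processing, not the CNN itself. The function names are hypothetical, absorbance is taken as log10(1/R), and applying the standard normal variate to the reflectance (rather than to the absorbance) is an assumption, since the abstract does not say which.

```python
import math
import statistics


def to_absorbance(reflectance):
    """Convert reflectance values (0..1] to apparent absorbance log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def standard_normal_variate(spectrum):
    """Centre a spectrum on its own mean and scale by its own std. dev."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(value - mean) / sd for value in spectrum]


def build_channels(reflectance):
    """Stack the three treatments as channels: shape (3, n_bands)."""
    return [
        list(reflectance),
        to_absorbance(reflectance),
        standard_normal_variate(reflectance),  # assumption: SNV of reflectance
    ]


# Made-up reflectance values for one composite pixel (a few bands only).
pixel_spectrum = [0.08, 0.11, 0.15, 0.24, 0.31, 0.28]
channels = build_channels(pixel_spectrum)
```

Stacking the treatments along a channel axis lets a one-dimensional convolution see all three versions of the same band neighbourhood at once.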
The findings of our analysis indicate that the spring-SRC performs slightly better than the other SRCs. Based on that, we concluded that the use of time-restricted SRCs addresses the problem of poor SOC predictions caused by disturbing factors at the soil surface, by targeting, within reduced seasonal series, the periods marked by the driest soils of the year. Overall, a promising prediction performance (R2 = 0.60, RMSE = 12.60 gC/kg, RPD = 1.59) was achieved by using the CNN model with the spring-SRC, an improvement of more than 9% in RMSE over current state-of-the-art machine learning methods such as partial least squares regression (R2 = 0.49, RMSE = 13.76 gC/kg, RPD = 1.4).
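For readers less familiar with the reported measures, the sketch below shows one common way to compute RMSE and RPD. It assumes that RPD is the ratio of the standard deviation of the observed values to the RMSE; the abstract does not state the exact definition used, and the function names are hypothetical.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations over RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


# Made-up SOC values (gC/kg) for a handful of validation samples.
soc_observed = [10.0, 20.0, 30.0, 40.0]
soc_predicted = [12.0, 18.0, 33.0, 37.0]
error = rmse(soc_observed, soc_predicted)
ratio = rpd(soc_observed, soc_predicted)
```

Under this definition, an RPD close to 1 means the model errors are about as large as the natural spread of the observations.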
Looking to the future, the proposed approach can be applied to the forthcoming hyperspectral orbital sensors to expand the current capabilities of the EO component by estimating more soil attributes with higher predictive performance.
Soil organic carbon (SOC) is essential for preserving and maintaining a range of soil and ecosystem functions as well as supplying and storing carbon for climate change mitigation. Digital Soil Mapping techniques were used to obtain spatially continuous predictions of SOC, especially over permanently vegetated areas. Recently available satellite Earth observation (EO) data, including those from the Copernicus Sentinel missions, were used as input for environmental covariates.
This work focuses on soil organic carbon mapping on vegetated areas with digital soil mapping (DSM). The approach suggested for permanently vegetated areas was implemented within the test case area of Bavaria (DE).
DSM is a well-established approach to model and map soil properties at un-surveyed locations. DSM techniques use legacy in situ soil data and relate them to spatially explicit environmental information describing the so-called SCORPAN (soil, climate, organisms, relief, parent material, age and site) factors. Among these SCORPAN factors, organisms (within soil and above) play a dominant role in explaining the variability in SOC at the regional to landscape scale. A statistical relationship between measured soil properties and soil forming factors (terrain, vegetation, climate to name a few) as measured by environmental covariates was established. Earth observation products are particularly effective in providing covariates characterising the role of organisms such as land use (change) maps and indicators for C input from the vegetation.
The EO covariates were derived from the Soil Composite Mapping Processor (SCMaP). The pre-processing consists of preparing the stack of EO images used for the SOC retrievals, based on per-pixel composites from time series of EO imagery from the Sentinel-2 archive. SOC observations at various locations in Bavaria (DE) were related to the previously described environmental covariates. The soil observations were split into 10 equally sized folds. Model tuning was performed with a 10-fold cross-validation procedure applied to multiple combinations of hyper-parameters. Random forest models were fitted with the ranger package, using the quantreg option to build Quantile Random Forests (QRF). With this option the prediction is not a single value (e.g. the average of the predictions from the decision trees in the random forest) but rather a cumulative probability distribution of the soil property at each location. Predictions were then assessed with classical performance measures, i.e. root mean squared error and model efficiency coefficient.
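The fold splitting and the model efficiency coefficient can be sketched as below. This is not the ranger-based workflow of the study. The model is passed in as generic fit and predict callables, the function names are hypothetical, and the model efficiency coefficient is assumed to be 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
import random


def assign_folds(n_samples, k=10, seed=0):
    """Return a fold index (0..k-1) per sample, with near-equal fold sizes."""
    folds = [i % k for i in range(n_samples)]
    random.Random(seed).shuffle(folds)
    return folds


def cross_validate(x, y, fit, predict, k=10, seed=0):
    """Predict every sample from a model fitted on the other k-1 folds."""
    folds = assign_folds(len(y), k, seed)
    predictions = [None] * len(y)
    for fold in range(k):
        train = [i for i, f in enumerate(folds) if f != fold]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i, f in enumerate(folds):
            if f == fold:  # held-out sample
                predictions[i] = predict(model, x[i])
    return predictions


def model_efficiency(observed, predicted):
    """1 - SSE/SST; 1 is a perfect fit, 0 is no better than the mean."""
    mean_o = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - sse / sst
```

For hyper-parameter tuning, the same loop would be repeated for each candidate combination and the combination with the best cross-validated score retained.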
The resulting models and maps indicated that Digital Soil Mapping, coupled with high-resolution products derived from Sentinel-2, is a powerful tool for producing soil property maps and monitoring changes in soil conditions over time.
For several years, temporal compositing techniques have been developed for the generation of Soil Reflectance Composites (SRC) that collect bare soil pixels from a multispectral stack of images across a certain time period. In most cases, the principles of these compositing techniques are the same: (1) the use of spectral reflectance indices of optical multispectral satellite data to develop a database for bare soil pixel selection, (2) the definition of reflectance index thresholds that enable the selection of undisturbed bare soil pixels from the index database and (3) the definition of the length and seasonal composition of the multitemporal data stack.
The index selection and the defined threshold are essential for the quality and usefulness of the generated bare soil reflectance composites. The limited spectral information of multispectral data such as Sentinel-2 and Landsat is mostly not sufficient to clearly distinguish between Bare Soils (BS) and Non-Photosynthetically active Vegetation (NPV) and also between BS and urban surfaces. A widely used index is the Normalized Difference Vegetation Index (NDVI), which separates photosynthetically active vegetated areas from non-active areas, in some cases combined with the well-known Normalized Burn Ratio 2 (NBR2) or the Bare Soil Index (BSI) to reduce the spectral confusion of BS with NPV. However, the index threshold values vary widely across studies and test sites, indicating the need for a site-specific definition of spectral index thresholds. This can be a very time-consuming task, especially for automated processors such as the Soil Composite Mapping Processor (SCMaP) that are designed to produce bare soil composites at the global scale.
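A minimal sketch of index-based bare soil selection is given below, using the usual definitions NDVI = (NIR - Red) / (NIR + Red) and NBR2 = (SWIR1 - SWIR2) / (SWIR1 + SWIR2). The threshold values and the pixel dictionary layout are hypothetical. As noted above, suitable thresholds are site-specific.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def nbr2(swir1, swir2):
    """Normalized Burn Ratio 2 from the two shortwave-infrared bands."""
    return normalized_difference(swir1, swir2)


def is_bare_soil(pixel, ndvi_max=0.25, nbr2_max=0.075):
    """Flag a pixel as bare soil when both indices fall below their limits.

    ndvi_max and nbr2_max are placeholder thresholds for illustration only.
    """
    return (ndvi(pixel["nir"], pixel["red"]) < ndvi_max
            and nbr2(pixel["swir1"], pixel["swir2"]) < nbr2_max)


# Made-up surface reflectances for three pixels.
green_crop = {"red": 0.05, "nir": 0.45, "swir1": 0.20, "swir2": 0.10}
bare_field = {"red": 0.22, "nir": 0.30, "swir1": 0.35, "swir2": 0.33}
crop_residue = {"red": 0.22, "nir": 0.30, "swir1": 0.40, "swir2": 0.30}
```

In this toy example the residue-covered pixel passes the NDVI test but is rejected by NBR2, which is the kind of BS/NPV confusion the second index is meant to reduce.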
In this study, we present a novel technique: Histogram Separation Threshold (HISET), which is generic, allows for regionalized threshold derivation, accounts especially for spectral similarity between BS and NPV and works independently from the selection of specific spectral indices. One key advantage of HISET is that it works fully automatically and avoids a subjective manual selection of the threshold value. The basic idea behind HISET is to evaluate the temporal variability of different Land Cover (LC) types such as agricultural fields, forests, water bodies and urban areas measured by spectral reflectance indices.
The threshold is derived from the Probability Density Functions (PDF) of index histograms of two spectrally similar LC types. The LC information is derived, for example, from all available European CORINE LC and LC change data sets and includes areas that are temporally stable across the observed period. Further, the per-pixel minimum and maximum index values are stored in temporal composites. The histogram of the min index composite is used to separate BS from NPV and the histogram of the max index composite is used to separate vegetated soils from permanently non-vegetated areas such as urban surfaces and water.
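The idea of deriving a threshold from the index distributions of two land cover types can be sketched as follows. The abstract does not state the exact rule HISET applies to the PDFs, so the criterion used here, the value that minimises the summed misclassified fractions of the two samples, is an assumption chosen for illustration, as are the function names and the sample values.

```python
def min_max_composite(index_series):
    """Per-pixel minimum and maximum of an index time series."""
    return min(index_series), max(index_series)


def separation_threshold(low_class, high_class, steps=200):
    """Scan candidate thresholds and keep the one that best splits two samples.

    low_class  : index values of the LC type expected below the threshold
    high_class : index values of the LC type expected above the threshold
    The cost is the fraction of low_class above the threshold plus the
    fraction of high_class at or below it (assumed criterion).
    """
    lo = min(min(low_class), min(high_class))
    hi = max(max(low_class), max(high_class))
    best_t, best_cost = lo, float("inf")
    for i in range(steps + 1):
        t = lo + (hi - lo) * i / steps
        cost = (sum(v > t for v in low_class) / len(low_class)
                + sum(v <= t for v in high_class) / len(high_class))
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


# Made-up values of a minimum-index composite for two LC types.
class_a_min_index = [0.01, 0.02, 0.03, 0.04]
class_b_min_index = [0.08, 0.09, 0.10, 0.12]
threshold = separation_threshold(class_a_min_index, class_b_min_index)
```

With real data the two samples overlap, and the chosen value then marks the point where the two histograms are confused least under the assumed cost.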
HISET is tested for selected biogeographic regions across Europe: Bavaria (Germany), Czech Republic, Wallonia (Belgium) and Central Macedonia (Greece). Thresholds are derived for NBR2 since this index is widely used in the literature and accounts for NPV. Additionally, a new index integrating the NDVI and SWIR information (PV+IR2) developed by the authors is tested. Further, the thresholds are derived in a strict mode to consistently avoid NPV disturbances and in a more relaxed mode to allow the addition of more predominantly bare soil pixels into the composites. The results show the variability of both index thresholds across the biogeographic regions. We discuss this variability and give a first outlook towards the development of a European-wide bare soil reflectance composite generation.
Soil is a key biotic element of the environment, and soil organic carbon (SOC) is the main component of soil organic matter (SOM). SOC contributes to nutrient retention and turnover, soil structure, moisture retention and availability, degradation of pollutants, and carbon sequestration. SOC is an indicator of soil health and essential for food production, mitigation and adaptation to climate change, and the achievement of the Sustainable Development Goals (SDG).
Recent developments in satellite and airborne sensors have sparked interest in Earth Observation for monitoring soil properties. The spectral and spatial resolution of sensors is gradually increasing together with the capacity for data analysis. Therefore, a range of present and future options exists for the workflow towards an Earth Observation soil monitoring system. The availability and data quality of recent satellite missions like the Copernicus Sentinels have dramatically changed the paradigm, making remote sensing of topsoils feasible in a coherent manner from regional to global scales.
The WORLDSOILS project aims at developing a pre-operational Soil Monitoring System to provide yearly estimations of SOC at global scale, exploiting space-based EO data, leveraging large soil data archives and modelling techniques to improve the spatial resolution and accuracy of SOC maps.
The ambition of the WORLDSOILS Monitoring System (WOSOMS) is to achieve a system with the following characteristics:
• Modular implementation, allowing additional soil indices to be integrated into the system in the future.
• Spatial resolution 100m x 100m globally and 50m x 50m over Europe.
• Appropriate confidence metrics provision.
• Use of large time series of a minimum of 3 years.
• Validation over three regions.
WORLDSOILS feasibility studies have focused on intermediate soil organic carbon indexes (bare and vegetated soils, forest and grassland) from a satellite reflectance composite built from a three- to five-year series of satellite observations. Intermediate SOC indexes are selected to cope with land cover diversity and are aligned with bare soils and permanently vegetated areas.
The project has systematically assessed the impact on prediction performance of various options at each step of the workflow towards a global EO Soil monitoring system. The options considered consist of current benchmark technology, options available in the near future and realistic scenarios taking into account corrections for potential disturbing factors.
The first step taken towards the development of the WOSOMS has been the compilation of a set of user requirements considering: i) the outcomes of the feasibility and impact assessment studies, ii) existing scientific literature, iii) applicable EO data policies and iv) user needs gathered from potential end-users. The resulting consolidated requirements, agreed with the stakeholders, will be used to design and implement this monitoring system on a suitable cloud environment, presumably one of the Data and Information Access Services (DIAS).
The system will operate for one year over three pilot regions, designed in coordination with National Reference Centres of Soil, representing different bioclimatic regions across Europe with a range of vegetation types, land use and soil types. The case studies will be based on data acquired during the operations phase in addition to data from the previous two years (three-year time series).
The project will culminate with a final symposium to present and discuss the results of the validation, and will provide the starting point for the future evolution and enhancement of the system.
WORLDSOILS is an application project funded by the European Space Agency and executed by GMV (prime contractor), Université Catholique de Louvain (UCL), Aristotle University of Thessaloniki (AUTh), German Research Centre for Geosciences (GFZ), German Aerospace Center - Remote Sensing Technology Institute (DLR-IMF), ISRIC, Czech University of Life Sciences Prague (CZU) and Tel-Aviv University (TAU).