This post is about Sentinel-2 multispectral data.
Sentinel-2 is a high-resolution, wide-swath imaging mission that supports the Copernicus Land Monitoring program. It plays a key role in observing vegetation, soil, water bodies, and coastal regions, as well as inland waterways. The satellite's Multispectral Instrument (MSI) captures data across 13 spectral bands: four bands at 10-meter resolution, six at 20 meters, and three at 60 meters. The Level-2A data is atmospherically corrected using the Sen2Cor processor for improved analysis.
Figure 7 - Quick visualisation of first and last observation in the timeseries.
Figure 8 - Left: cloud-free observation; right: observation with clouds and cloud shadows
The observation on the right is affected by clouds and cloud shadows.
Clouds degrade the composite image: while cloud shadows are removed using the max operation, the bright clouds still interfere with the overall quality. To improve the usability of raw data, it’s generally important to remove cloud-covered pixels and work only with cloud-free data. In Earth observation data, it’s common to have separate masking layers that indicate whether a pixel is cloud-covered. For Sentinel-2, this is done using the "scene classification" layer (SCL), produced by the Sen2Cor algorithm.
With openEO and the openEO Python client, we can use the SCL band (already included in the load_collection call) to apply cloud masking as follows:
First, we create a binary cloud mask based on SCL values: 3 for cloud shadows, 8 for medium-probability clouds, and 9 for high-probability clouds. This cloud masking significantly improves the composite image. However, some artifacts remain due to the quality of the SCL band and the simplicity of the cloud mask.
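To make the masking logic concrete, here is a minimal local sketch in NumPy. It is not the openEO client code (which runs on the backend); the array names and the `(time, y, x)` layout are assumptions made for illustration. It flags SCL classes 3, 8 and 9, blanks those pixels, and then takes the maximum over time.

```python
import numpy as np

# SCL classes treated as cloudy, as listed in the text:
# 3 = cloud shadow, 8 = medium-probability cloud, 9 = high-probability cloud
CLOUDY_SCL_CLASSES = (3, 8, 9)


def masked_max_composite(band: np.ndarray, scl: np.ndarray) -> np.ndarray:
    """Max composite over time that ignores pixels flagged in the SCL layer.

    `band` and `scl` are hypothetical arrays of shape (time, y, x).
    """
    cloud_mask = np.isin(scl, CLOUDY_SCL_CLASSES)
    masked = np.where(cloud_mask, np.nan, band.astype(float))
    # np.fmax skips NaN values; a pixel stays NaN only if every date is masked
    return np.fmax.reduce(masked, axis=0)


# Two dates, one row, two pixels; the second pixel is cloudy on the first date
band = np.array([[[0.2, 0.9]], [[0.3, 0.1]]])
scl = np.array([[[4, 9]], [[4, 4]]])
composite = masked_max_composite(band, scl)
```

Without the mask, the bright cloud value (0.9) would win the max operation for the second pixel; with the mask, the clear observation is kept instead.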
Figure 9 - Clouds in the composite
Clouds impact this composite: while the max operation removes cloud shadows, the bright clouds still degrade the image quality.
Cloud masking has noticeably improved the composite. However, some artifacts remain due to the limitations of the SCL band and the simplicity of our cloud mask.
Cloud Masking in NDVI
The result above provides a solid foundation, but there's potential for improvement to achieve smoother NDVI profiles.
Many outliers remain because we didn't filter out cloudy observations or pixels. We can address this by using the "SCL" (scene classification) band from the "SENTINEL2_L2A" collection to focus only on cloud-free pixels. Let's reload the "SENTINEL2_L2A" data cube, including the "SCL" band, and calculate NDVI as before.
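As a rough local illustration of this step, the sketch below computes NDVI as (NIR - red) / (NIR + red), which for Sentinel-2 corresponds to bands B08 and B04, and averages it per date over cloud-free pixels only. The arrays and their `(time, y, x)` layout are hypothetical; in the actual workflow the backend performs these operations on the data cube.

```python
import numpy as np


def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - red) / (NIR + red); NaN where the denominator is zero."""
    red = red.astype(float)
    nir = nir.astype(float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


def mean_ndvi_per_date(red, nir, scl, cloudy_classes=(3, 8, 9)):
    """Spatial mean NDVI per date, using only pixels not flagged as cloudy."""
    values = ndvi(red, nir)
    values[np.isin(scl, cloudy_classes)] = np.nan
    valid = ~np.isnan(values)
    counts = valid.sum(axis=(1, 2))
    sums = np.where(valid, values, 0.0).sum(axis=(1, 2))
    means = np.full(counts.shape, np.nan)
    # Dates without any clear pixel stay NaN instead of dragging the mean down
    np.divide(sums, counts, out=means, where=counts > 0)
    return means


# Date 0 is clear, date 1 is fully covered by high-probability cloud
red = np.array([[[0.1, 0.1]], [[0.3, 0.3]]])
nir = np.array([[[0.5, 0.3]], [[0.3, 0.3]]])
scl = np.array([[[4, 4]], [[9, 9]]])
series = mean_ndvi_per_date(red, nir, scl)
```

Dates that end up as NaN are the gaps that a later interpolation step has to fill.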
However, this mask can be noisy due to imperfect classification. To be more cautious, we can expand it slightly to exclude additional pixels at cloud edges. This can be done by performing a morphological operation, using a convolution with a Gaussian kernel and applying a threshold to get a refined binary mask.
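The idea can be tried locally with NumPy and SciPy, as in the sketch below. The kernel size (11), standard deviation (1.6) and threshold (0.1) are illustrative assumptions, not values confirmed by the text. Convolving the binary mask with a normalized Gaussian kernel spreads each flagged pixel over its neighbours, and thresholding the result turns it back into a slightly larger binary mask.

```python
import numpy as np
from scipy.signal import convolve2d


def gaussian_kernel(size: int = 11, std: float = 1.6) -> np.ndarray:
    """Normalized 2D Gaussian kernel (size and std are assumed values)."""
    offsets = np.arange(size) - (size - 1) / 2
    g = np.exp(-0.5 * (offsets / std) ** 2)
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def dilate_mask(mask: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Grow a binary mask by smoothing it and thresholding the result."""
    smoothed = convolve2d(
        mask.astype(float), gaussian_kernel(),
        mode="same", boundary="fill", fillvalue=0.0,
    )
    return smoothed > threshold


# A 5x5 cloud in the middle of a 21x21 scene
mask = np.zeros((21, 21), dtype=bool)
mask[8:13, 8:13] = True
dilated = dilate_mask(mask)
```

A side effect of this approach is that very small isolated detections can fall below the threshold and disappear, so the kernel and threshold need to be tuned together.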
Timeseries Smoothing
As the final step in this process, we'll introduce an openEO user-defined function (UDF) to the workflow. A UDF allows you to submit a snippet of code, such as Python, to be executed on the backend. In this case, we'll define a UDF to:
Interpolate missing values (caused by cloud filtering)
Apply a Savitzky-Golay filter for temporal smoothing of the timeseries (using scipy.signal.savgol_filter)
While it's possible to load a UDF from an external file, we'll load it as an inline snippet here:
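The core of such a UDF can be sketched as a plain function operating on a single pixel's time series. This is a local illustration only: the wrapper that the backend expects around a UDF is omitted, and the window length (5) and polynomial order (2) are assumed values. It also treats observations as evenly spaced, which real acquisition dates usually are not.

```python
import numpy as np
from scipy.signal import savgol_filter


def interpolate_and_smooth(values, window_length=5, polyorder=2):
    """Fill NaN gaps linearly, then apply a Savitzky-Golay filter."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    valid = ~np.isnan(values)
    # Linear interpolation over the gaps left by cloud masking
    filled = np.interp(idx, idx[valid], values[valid])
    return savgol_filter(filled, window_length=window_length, polyorder=polyorder)


# Hypothetical NDVI series with gaps where cloudy pixels were removed
ndvi_series = np.array([0.21, 0.25, np.nan, 0.40, 0.52, np.nan, np.nan, 0.71, 0.69, 0.60])
smoothed = interpolate_and_smooth(ndvi_series)
```

Interpolating first matters because `savgol_filter` does not skip NaN values: a single gap would otherwise contaminate every window that touches it.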
Figure 10 - Timeseries Graph
Figure 11 - Timeseries Graph of Masked Data
Figure 12 - Timeseries Graph of Smoothed Data
The red line in the plot represents NO2 concentrations during the COVID lockdown period in a specific area of Delhi, India, while the green line shows the levels during the same months in the post-COVID period. A slight reduction in NO2 levels is observed during the lockdown, indicating lower air pollution at that time. As restrictions eased, NO2 levels rose again. Additionally, the data highlights that air pollutant concentrations in Delhi are generally higher between November and April than between May and September.
Other scenarios can be explored with Sentinel-5P data within the Copernicus Data Space Ecosystem, such as analyzing PM2.5 concentrations, ozone layer depletion, and SO2 levels.
Figure 13 - Comparison of NO2 levels during and after the COVID lockdown
Radar - Sentinel-1: ARD SAR Backscatter
In certain cases, the preprocessed data collections available on openEO backends may not meet specific needs or may be inappropriately preprocessed. openEO offers processes to handle common preprocessing tasks, such as:
Atmospheric correction of optical data
SAR backscatter computation
These processes come with customizable parameters to tailor the processing to your requirements.
However, keep in mind that these operations can be computationally intensive, potentially increasing the overall processing time and cost. It's important to make informed choices when utilizing these methods.
This notebook is adapted from an existing sample pipeline for Radar ARD on the openEO platform. In this version, we demonstrate it with the Copernicus Data Space Ecosystem backend.
On-demand SAR Backscatter
Data from synthetic aperture radar (SAR) sensors requires extensive preprocessing for calibration and normalization, referred to as backscatter computation. This is facilitated in openEO using the sar_backscatter process.
In this example, the radiometric correction coefficient used is "sigma0-ellipsoid."
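For orientation, the sketch below writes out what such a request could look like as an openEO process graph in its JSON form, built here as a plain Python dictionary. The collection id, extent, dates and band names are placeholder assumptions; only the `sar_backscatter` process and the "sigma0-ellipsoid" coefficient come from the text. With the Python client, the same step is normally expressed as a `sar_backscatter` call on the data cube rather than written by hand.

```python
import json

# Hypothetical extent and dates, for illustration only
process_graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "SENTINEL1_GRD",  # assumed collection id
            "spatial_extent": {"west": 5.0, "south": 51.2, "east": 5.1, "north": 51.3},
            "temporal_extent": ["2021-01-01", "2021-01-08"],
            "bands": ["VV", "VH"],
        },
    },
    "backscatter": {
        "process_id": "sar_backscatter",
        "arguments": {
            # Take the output of the load step as input
            "data": {"from_node": "load"},
            "coefficient": "sigma0-ellipsoid",
        },
        "result": True,
    },
}

print(json.dumps(process_graph, indent=2))
```

Because backscatter is computed on demand, keeping the spatial and temporal extent small while experimenting helps limit processing time and cost.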
Figure 14 - Radiometric Correction of Imagery
Sentinel-3 OLCI
The OLCI dataset provided by Sentinel Hub is derived from Level-1B products, which are delivered in "instrument" projection rather than a ground-based reference system. As a result, these products do not have a 'native' spatial reference system. In openEO, the collections are configured to use unprojected coordinates in EPSG:4326, with a fixed resolution designed to approximate the native 300 m ground resolution.
Figure 15 - Sentinel-3 OLCI Imagery, and Bar Graph
User-Defined Processes (UDP) in openEO
openEO enables users to chain processes together in a process graph to construct custom algorithms. Often, certain (sub)graphs are reused within the same or across different process graphs or algorithms. To streamline this, openEO backends allow you to save these subgraphs as "User-Defined Processes" (UDP), creating a library of reusable openEO components.
This notebook offers a step-by-step guide on how to create and apply a User-Defined Process for a Normalized Difference Water Index (NDWI) use case.
Building a parameterized datacube
The openEO Python client allows you to define parameters using Parameter instances from the openeo.api.process subpackage. Typically, you need to provide at least the parameter name, a description, and a schema.
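To show how these three fields fit together, the sketch below uses plain dictionaries in openEO's JSON style instead of the client classes. The parameter names, descriptions and the band selection are hypothetical. A process graph refers to a parameter by name through a `from_parameter` reference, and a small helper checks that every reference has a matching declaration.

```python
# Hypothetical parameter declarations: name, description and schema
parameters = [
    {
        "name": "bbox",
        "description": "Spatial extent to load.",
        "schema": {"type": "object", "subtype": "bounding-box"},
    },
    {
        "name": "temporal_extent",
        "description": "Start and end date of the period to load.",
        "schema": {"type": "array", "subtype": "temporal-interval"},
    },
]

# A parameterized load step: concrete values are filled in when the process is called
load_node = {
    "process_id": "load_collection",
    "arguments": {
        "id": "SENTINEL2_L2A",
        "spatial_extent": {"from_parameter": "bbox"},
        "temporal_extent": {"from_parameter": "temporal_extent"},
        "bands": ["B04", "B08"],  # assumed band selection
    },
}


def referenced_parameters(node) -> set:
    """Collect every parameter name referenced anywhere inside a node."""
    if isinstance(node, dict):
        if set(node) == {"from_parameter"}:
            return {node["from_parameter"]}
        return set().union(*(referenced_parameters(v) for v in node.values()))
    if isinstance(node, list):
        return set().union(*(referenced_parameters(v) for v in node))
    return set()


declared = {p["name"] for p in parameters}
used = referenced_parameters(load_node)
```

Keeping the declared and referenced names in sync is what lets the same process be reused with a different area or period.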
Figure 16 - Rescaled Imagery
NDWI (Normalized Difference Water Index) is a vegetation index that indicates the water content within vegetation and complements the NDVI (Normalized Difference Vegetation Index). High NDWI values reflect higher water content in the vegetation.
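Since the description above refers to vegetation water content, the index is presumably the NIR/SWIR formulation; the exact bands used in the notebook are not stated here, so the symbols below are an assumption. Written out, with ρ denoting surface reflectance:

```latex
\mathrm{NDWI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{SWIR}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{SWIR}}}
```

For example, a pixel with a NIR reflectance of 0.4 and a SWIR reflectance of 0.2 gives (0.4 - 0.2) / (0.4 + 0.2) ≈ 0.33. Values range from -1 to 1.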
Figure 17 - NDWI Equation
Figure 18 - Imagery Classified in NDWI
Creating a smoothed dataset using Whittaker
In this notebook, we use the Whittaker algorithm from the FuseTS toolbox as a user-defined function (UDF) to generate a smoothed time series. This algorithm employs a discrete penalized least squares method to fit a smooth series, denoted as z, to the original data series, denoted as y.
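The principle can be sketched in a few lines of NumPy. This is a textbook-style illustration, not the FuseTS implementation, and the smoothing parameter `lam` and difference order are assumed values. The smooth series minimizes the weighted squared distance to the data plus `lam` times the sum of squared differences of the smooth series, which leads to a linear system. Missing observations get a weight of zero, so the smoother fills them in.

```python
import numpy as np


def whittaker_smooth(y, lam=10.0, order=2, weights=None):
    """Penalized least squares smoother: solve (W + lam * D'D) z = W y."""
    y = np.asarray(y, dtype=float)
    n = y.size
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    missing = np.isnan(y)
    w = np.where(missing, 0.0, w)      # missing values do not pull on the fit
    y_filled = np.where(missing, 0.0, y)
    # Difference matrix of the given order, shape (n - order, n)
    D = np.diff(np.eye(n), n=order, axis=0)
    A = np.diag(w) + lam * D.T @ D
    return np.linalg.solve(A, w * y_filled)


def roughness(series, order=2):
    """Sum of squared differences, the quantity the penalty term controls."""
    return float(np.sum(np.diff(series, n=order) ** 2))


# Hypothetical noisy NDVI-like series, with and without a gap
noisy = np.array([0.2, 0.5, 0.3, 0.6, 0.4, 0.7, 0.5, 0.8])
smooth = whittaker_smooth(noisy)
with_gap = whittaker_smooth([0.2, 0.5, np.nan, 0.6, 0.4, 0.7, 0.5, 0.8])
```

A larger `lam` gives a smoother curve that follows the data less closely. The dense matrices used here are fine for short series; longer ones would call for a sparse solver.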
We will focus on rapeseed data from 2019 in Northern Spain.
First, we will create an openEO process to calculate the NDVI time series for our area of interest. We start by using the SENTINEL2_L2A collection and applying the Sen2Cor cloud masking algorithm to eliminate any cloud interference before computing the NDVI values.
Once we have the NDVI time series, we can request openEO to download the results to our local storage, enabling us to access the file for further analysis within this notebook.
Finally, we will plot the raw NDVI time series, averaged across the parcel.
Figure 19 - Raw NDVI TimeSeries Graph
Figure 20 - Raw NDVI Compared to Smoothed NDVI TimeSeries Graph
Figure 21 - Table of NO2 Levels, Surface Temperature, and Cloud Fraction in %
Figure 22 - Graph of NO2 Levels Over Time