Big data is common in geographical research, where observations are often spread across many locations and times. Researchers can spend days, even months, completing basic calculations on such data. For the torrent of movement points (over 35 million) collected by the British Trust for Ornithology, is there a better way to see how ‘wet’ the areas near the movement paths of Greater Spotted Eagles are?
Large Eagles, Even Larger Data: Methods for Handling Large Satellite Datasets for Monitoring Eagle Movement in Europe
Geographical Information System and Data Processing | Adele Gao
The Polesia region in Eastern Europe is the stronghold for the Greater Spotted Eagle population. According to Project Polesia - Wilderness Without Borders, the region is home to more than 16% of Europe’s endangered species, with the eagle being a totemic example that has both ecological and cultural importance [1]. This project is focused on supporting a research effort into how eagles in the Polesia region change their behaviour based on environmental differences in the wetlands. In other words, how ‘wet’ is the landscape surrounding the movement points of eagles?
The British Trust for Ornithology gathered the eagle movement data, and we used Google Earth Engine to assess the normalised difference water index (NDWI) around the movement points. Because of Russia’s invasion of Ukraine, satellite data is the only viable way to monitor environmental conditions. The Polesia region lies half in Ukrainian and half in Belarusian territory, and incorporates the Chernobyl exclusion zone. The combination of war and radioactive contamination means that remote sensing is a vital tool in monitoring the affected ecosystems.
We calculated the NDWI for daily areas of interest (AOIs) generated from the coordinates in the movement data, grouped by date. We also calculated the NDWI for a sample of movement points collected between October 1 and October 2, 2019, in order to examine some of the data-handling issues in more detail. The movement information as a whole is a huge dataset of 35 million points, which presents significant challenges in extracting environmental parameters from the satellite data record. Our work therefore seeks to enable chronological and geographical comparisons of NDWI values that could then be used to monitor changes in wetlands across the region.
Figure 1: Map of the study region, including eagle movement points and area of interest with NDWI values. Example data refers to movement points from October 2, 2019. Image by Adele Gao using QGIS.
Method
Charlie Russell and Dr Adham Ashton-Butt from the British Trust for Ornithology provided the movement data used in this work. The raw data contains 35,610,268 points across the Polesia region from 2018 to 2022. These 35 million data points are the core of the problem we faced. How can we process them effectively, whilst also managing the terabytes of satellite data that provide the contextual environmental information?
Data Pre-processing
We wanted to compare the NDWI values to study wetland changes chronologically. Therefore, we sorted the movement points by their recording time using functions from the Pandas library. We then grouped the points by day and extracted each day’s maximum and minimum latitude and longitude. From the resulting data frame, we calculated the daily NDWI over an AOI whose corners are that day’s maximum and minimum latitude and longitude. It is this step that enables us to batch process the movement data record through the Google Earth Engine API on a daily basis, whilst remaining under the spatial extent quota.
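The grouping step described above can be sketched in a few lines of Pandas. This is a minimal illustration rather than the project’s actual code: the column names (timestamp, latitude, longitude), the helper name daily_bounding_boxes, and the four sample rows are all assumptions made up for the example.

```python
import pandas as pd

# Toy stand-in for the movement data. The column names and values are
# assumptions for illustration, not the real BTO schema or records.
points = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2019-10-01 06:15:00",
                "2019-10-01 14:40:00",
                "2019-10-02 09:05:00",
                "2019-10-02 17:30:00",
            ]
        ),
        "latitude": [51.90, 52.10, 51.75, 51.95],
        "longitude": [27.30, 27.85, 28.10, 27.60],
    }
)


def daily_bounding_boxes(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per calendar day with the extent of that day's points."""
    ordered = df.sort_values("timestamp")
    # Group key: the calendar date of each fix, named "date" for the output.
    day = ordered["timestamp"].dt.date.rename("date")
    return (
        ordered.groupby(day)
        .agg(
            min_lat=("latitude", "min"),
            max_lat=("latitude", "max"),
            min_lon=("longitude", "min"),
            max_lon=("longitude", "max"),
            n_points=("latitude", "size"),
        )
        .reset_index()
    )


daily_aoi = daily_bounding_boxes(points)
print(daily_aoi)
```

Each row of daily_aoi then describes one rectangular AOI, so the number of requests sent to the satellite archive scales with the number of days rather than the number of points.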
Calculation of NDWI for daily AOIs
The calculations used Bands 3 and 8 from the Sentinel-2 collection with a pixel scale of 10 m. It took nine minutes to finish the calculation for the 963 daily AOIs, compared with the three months it would take to process each of the 35 million points one by one through the Earth Engine API. We also used Bands 1 and 2 from MODIS with a pixel scale of 250 m; this calculation took 12 minutes for the 963 daily AOIs.
The same AOI had different NDWI values depending on which of the two datasets the calculation was based on. The difference between the two results was less than 0.8, and the pixel scale difference between the MODIS and Sentinel-2 band collections was around 200-230 m for this calculation. The pixel scale drives the difference in observed values, as the contents of each pixel are fundamentally different between the sensors.
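For readers unfamiliar with the Earth Engine Python API, one daily AOI calculation might look roughly like the sketch below. Several details are assumptions, since the article does not state them: the collection ID COPERNICUS/S2_SR_HARMONIZED, the use of a median composite followed by a mean over the AOI, and the helper names day_window and mean_ndwi_for_aoi. It also assumes Earth Engine has already been authenticated and initialised.

```python
from datetime import date, timedelta


def day_window(day: date) -> tuple[str, str]:
    """Return ISO start/end strings covering one calendar day (end exclusive)."""
    return day.isoformat(), (day + timedelta(days=1)).isoformat()


def mean_ndwi_for_aoi(min_lon, min_lat, max_lon, max_lat, day,
                      collection_id="COPERNICUS/S2_SR_HARMONIZED",  # assumed
                      bands=("B3", "B8"), scale=10):
    """Mean NDWI over one daily AOI, or None when no imagery covers it.

    Assumes ee.Authenticate() and ee.Initialize() were called beforehand.
    """
    import ee  # imported lazily so day_window() works without Earth Engine

    aoi = ee.Geometry.Rectangle([min_lon, min_lat, max_lon, max_lat])
    start, end = day_window(day)
    images = (
        ee.ImageCollection(collection_id)
        .filterBounds(aoi)
        .filterDate(start, end)
    )
    if images.size().getInfo() == 0:
        return None  # no image for this AOI on this day

    # normalizedDifference computes (first - second) / (first + second)
    # and names the output band "nd".
    ndwi = images.median().normalizedDifference(list(bands))
    stats = ndwi.reduceRegion(
        reducer=ee.Reducer.mean(), geometry=aoi, scale=scale, maxPixels=1e9
    )
    return stats.get("nd").getInfo()
```

Calling this once per row of a daily bounding-box table gives one request per day. The collection_id, bands, and scale parameters are exposed so the same function could be pointed at a coarser sensor.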
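The comparison between the two sensors amounts to a per-day subtraction that skips days where either result is missing. A minimal sketch follows, using invented values and a hypothetical helper name (ndwi_differences); none of the numbers are the study’s results.

```python
# Made-up daily results keyed by date; None marks a day with no valid value.
ndwi_modis = {"2019-10-01": 0.21, "2019-10-02": 0.18, "2019-10-03": 0.25}
ndwi_sentinel2 = {"2019-10-01": -0.12, "2019-10-02": -0.07, "2019-10-03": None}


def ndwi_differences(modis, sentinel2):
    """MODIS minus Sentinel-2 for every day where both values exist."""
    return {
        day: modis[day] - sentinel2[day]
        for day in modis
        if modis[day] is not None and sentinel2.get(day) is not None
    }


differences = ndwi_differences(ndwi_modis, ndwi_sentinel2)
# Largest absolute disagreement between the two sensors on any shared day.
largest_gap = max(abs(value) for value in differences.values())
print(differences, largest_gap)
```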
Calculation of NDWI for each point
For comparison, we calculated the NDWI for each movement point in our October subset. We extracted 3,108 movement points from October 1 to October 2, 2019. The calculation using Sentinel-2 data (Bands 3 and 8 with a pixel scale of 10 m) had a runtime of 20-23 minutes for each of the 3,108 points. The results were incomplete, with 209 points returning null. The MODIS calculation used Bands 1 and 2 with a pixel scale of 250 m to calculate the NDWI for the 3,108 points. Each point returned a valid NDWI value, and each point’s runtime was 5-6 minutes. Therefore, the daily AOI approach is the preferred method moving forward.
Figure 2: Graphic representation of differences between NDWI for daily AOI calculated based on Sentinel-2 and MODIS (NDWI_MODIS – NDWI_Sentinel-2 = Differences in values). Image by Adele Gao in Google Colab.
Figure 3: Flow chart of NDWI calculation. Image by Adele Gao using Microsoft PowerPoint.
Further work
So far, we have only calculated NDWI as the indication for wetness; there are other factors to be considered in the context of the Polesia region, such as soil moisture and precipitation for the AOIs, which could be gained from a range of other sensors. Further work is required to develop these data streams and integrate them into the time-AOI batch processing approach.
Since evaluating the relationship between eagle movement and the wetness of the environment requires more than one environmental measurement, we are unable to draw conclusive findings on the correlation between eagle movements and the environment. However, the method for calculating the NDWI is a step forward in simplifying the analysis of big data and can be applied to other geographical data processing.
Conclusion
Using the cloud processing abilities of Google Earth Engine, we were able to extract and calculate an environmental parameter from a large satellite data record in a fraction of the time it would have taken with a point-by-point approach. The British Trust for Ornithology now wants to apply this approach to its other wildlife monitoring programmes, expanding the range of sensors and parameters used to truly monitor the wetland conditions as experienced by the animals living within them. Given the ongoing war in Ukraine, this remains the only way to monitor how vulnerable species are coping with both the impact of the war itself and our changing climate.
Acknowledgements
We sincerely thank Dr Adham Ashton-Butt and Charlie Russell (British Trust for Ornithology) for providing the valuable movement data and feedback on the methods. Many thanks go to Dr Thomas Dowling for advice, encouragement, and support throughout the research process.
Glossary
The Normalised Difference Water Index (NDWI) is a remote-sensing-derived index related to liquid water. For this research, we used the NDWI to detect water bodies alongside the eagles’ movement points. For the Sentinel-2-based calculation, we used “Green” Band 3 and “Near Infrared (NIR)” Band 8 [2]. For the MODIS-based calculation, we used “Near Infrared (NIR)” Band 2 and “Shortwave Infrared (SWIR)” Band 6 [3].
Formula for Sentinel-2-based calculation:
NDWI = (Band 3 - Band 8)/(Band 3 + Band 8)
Formula for MODIS-based calculation:
NDWI = (Band 2 - Band 6)/(Band 2 + Band 6)
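Both formulas share the same normalised-difference form, which can be written as one small function. The sketch below is illustrative only: the function name and the reflectance values are assumptions, and returning None for a zero denominator is one possible convention.

```python
def normalised_difference(band_a: float, band_b: float):
    """Return (band_a - band_b) / (band_a + band_b), or None if undefined."""
    total = band_a + band_b
    if total == 0:
        return None  # both bands are zero, so the ratio has no value
    return (band_a - band_b) / total


# Illustrative reflectances only, in Sentinel-2 order (Band 3, then Band 8).
water_like = normalised_difference(0.08, 0.02)  # green > NIR gives a positive value
vegetated = normalised_difference(0.05, 0.35)   # NIR > green gives a negative value
print(water_like, vegetated)
```

For non-negative reflectances the result always lies between -1 and 1, which is why values from different sensors can be compared on the same scale even though they are built from different bands.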
Sentinel-2 is a constellation of two identical satellites in the same orbit, imaging land and coastal areas at high spatial resolutions (10 m, 20 m, or 60 m) in the optical domain [4]. The revisit frequency of each single Sentinel-2 satellite is 10 days, and the combined constellation revisit is 5 days [2].
Moderate Resolution Imaging Spectroradiometer (MODIS) is a key instrument aboard the Terra and Aqua satellites. Terra MODIS and Aqua MODIS view the Earth’s entire surface every 1 to 2 days, acquiring data in 36 spectral bands or groups of wavelengths [5].
Discussion
Data availability
Due to the long revisit time of the Sentinel-2 satellites, most of the results returned from the calculation for daily AOIs were null. In contrast, calculations based on MODIS returned complete results for all 963 AOIs, thanks to the daily revisit time of the Terra and Aqua satellites on which MODIS is mounted. However, because of the large pixel scale of the MODIS calculations, the results are very different from those based on Sentinel-2, with differences of up to 0.8 between the two.
To improve this, we could use combined results and estimate more accurate NDWI values based on the differences between the two sets of calculations. Another plausible approach would be calculating the margin of difference between the two result sets and making estimations for Sentinel-2 results based on such differences. Alternatively, we could use another set of satellite data, for example, Landsat collections, to compare with MODIS and Sentinel-2. However, the results were also incomplete based on test calculations done from the Landsat collection.
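One way to read the difference-based idea above is as a simple offset correction: measure the average MODIS minus Sentinel-2 gap on days where both exist, then subtract it from MODIS on days where Sentinel-2 is null. The sketch below shows only that reading. The constant-offset assumption is crude, the function name fill_sentinel2_gaps is hypothetical, and the values are invented; it has not been validated against the study data.

```python
from statistics import mean


def fill_sentinel2_gaps(modis, sentinel2):
    """Estimate missing Sentinel-2 NDWI from MODIS using a mean offset.

    Assumes a constant MODIS-minus-Sentinel-2 offset, which is a strong
    simplification made for illustration.
    """
    paired = [
        day for day in modis
        if modis[day] is not None and sentinel2.get(day) is not None
    ]
    if not paired:
        return dict(sentinel2)  # nothing to calibrate against
    offset = mean(modis[day] - sentinel2[day] for day in paired)

    filled = {}
    for day in modis:
        observed = sentinel2.get(day)
        if observed is not None:
            filled[day] = observed  # keep real Sentinel-2 values untouched
        elif modis[day] is not None:
            filled[day] = modis[day] - offset  # estimated value
        else:
            filled[day] = None
    return filled


# Invented example values; the third day has no Sentinel-2 result.
modis_example = {"day1": 0.30, "day2": 0.20, "day3": 0.40}
sentinel2_example = {"day1": 0.10, "day2": 0.00, "day3": None}
estimated = fill_sentinel2_gaps(modis_example, sentinel2_example)
print(estimated)
```

Any estimate produced this way should be flagged as modelled rather than observed, since the offset between sensors is unlikely to be constant across land cover types.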
For the point NDWI calculation, we chose a small part of the whole set of movement points as sample data based on the availability of daily NDWI results; October 1 to October 2, 2019, returned valid NDWI values when calculated based on their AOI. Sentinel-2 data can return valid results for most movement points. Therefore, we could use the results based on MODIS as a background reference for the final NDWI values.
[1] Wild Polesia, “Our work: Greater Spotted Eagles,” wildpolesia.org. https://wildpolesia.org/greater-spotted-eagles (accessed Mar. 4, 2024).
[2] European Space Agency, “Resolutions,” sentinels.copernicus.eu. https://sentinels.copernicus.eu/web/sentinel/user-guides/sentinel-2-msi/resolutions (accessed Mar. 2, 2024).
[3] B. Gao, “NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space,” Remote Sensing of Environment, vol. 58, no. 3, pp. 257–266, Dec. 1996, doi: https://doi.org/10.1016/s0034-4257(96)00067-3.
[4] European Space Agency, “Sentinel-2,” www.esa.int. https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-2 (accessed Mar. 2, 2024).
[5] NASA, “MODIS: About,” modis.gsfc.nasa.gov. https://modis.gsfc.nasa.gov/about/ (accessed Mar. 2, 2024).
Adele is a fourth-year science student majoring in geographical information science (GIS), focusing on GIS-oriented programming and spatial analysis. She is interested in environmental development, animal geography, and GIS application in urban areas. Outside of study, she enjoys playing video games, her favourite being the Fallout series.