Data

Understand the data behind the platform

Explore how ecosystems are mapped, learn how to use the data, browse our data sources, and find answers to your questions.

Methodology

Data Compilation

The Global Ecosystems Atlas synthesis map is developed by searching for, and compiling, existing spatial data products that are intended to represent ecosystems ('source datasets'). Searches cover publicly available data repositories and datasets associated with the scientific literature, and are complemented by a program of coordinated outreach to national environment agencies and ecosystem map developers.

Each source dataset undergoes a rigorous evaluation and quality assessment covering class definitions, validation protocols, accuracy assessments, data currency, spatial resolution and licensing conditions.

The evaluation protocol ensures the inclusion of data that is suitable for representing ecosystems and meets consistent data quality and metadata reporting standards. The results of the data compilation phase, prior to the application of the evaluation protocol, are included in the Sources Catalogue.

Correspondence to the IUCN Global Ecosystem Typology Methodology

Cross-referencing map classes from the source datasets to the ecosystem functional groups defined by the Global Ecosystem Typology is a critical process that enables the development of the gea_synthesis data product. This process involves an evaluation of the membership relationship of every map class in each source dataset to ecosystem functional groups defined and described by the Global Ecosystem Typology. Ecosystem functional groups represent level 3 of the Global Ecosystem Typology and are nested within biomes (level 2), which in turn are nested within realms (level 1).
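The nesting of the three levels can be illustrated with a short sketch. It assumes typology codes follow the pattern used in the published Global Ecosystem Typology (for example `T1.1`, where `T` is the realm, `T1` the biome and `T1.1` the ecosystem functional group). The names `TypologyLevels` and `split_efg_code` are hypothetical and are not part of the Atlas pipeline.

```python
import re
from typing import NamedTuple


class TypologyLevels(NamedTuple):
    realm: str  # level 1, e.g. "T"
    biome: str  # level 2, e.g. "T1"
    efg: str    # level 3, e.g. "T1.1"


# Assumed code pattern: realm letters, biome number, ".", group number.
_EFG_PATTERN = re.compile(r"^([A-Z]+)(\d+)\.(\d+)$")


def split_efg_code(code: str) -> TypologyLevels:
    """Derive the realm and biome codes that an EFG code is nested within."""
    code = code.strip()
    match = _EFG_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a recognised ecosystem functional group code: {code!r}")
    realm, biome_number, _group_number = match.groups()
    return TypologyLevels(realm=realm, biome=f"{realm}{biome_number}", efg=code)


print(split_efg_code("T1.1"))
print(split_efg_code("MFT1.2"))
```

Deriving the upper levels from the group code means a single cross-reference decision per map class is enough to populate all three levels.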

A systematic membership analysis is conducted for each class in a source dataset, according to cross-referencing guidelines developed by the IUCN Global Ecosystem Typology working group. The analysis compares the properties of a given map class, including its class definition and spatial relationships, with each ecosystem functional group, and incorporates expert advice. Map classes in the source dataset that are estimated to have little or no membership to an ecosystem functional group because of fundamentally different classification schemes, or those that are unresolved due to partial mismatch in class definitions or uncertainty, are excluded from the gea_synthesis data product. The data mask layer indicates, per pixel, where class memberships remain unresolved. Unresolved classes typically require further expert input or more data to estimate memberships, and are therefore recorded in the Global Ecosystems Atlas issues tracker (see below).

For each source dataset, the Global Ecosystems Atlas maintains a set of formatted classification cross-reference tables that are accessible on GitHub.

Methods for evaluating correspondence to the IUCN Global Ecosystem Typology

Source Data Review

Each source dataset is thoroughly reviewed to understand its classification system and how it relates to the Global Ecosystem Typology. In most cases, the Atlas science team works with the team who developed the source map to understand the provenance and intent of the data. Only datasets that are conceptually aligned to the Global Ecosystem Typology (i.e. that were developed to represent ecosystems), either as a whole or in part, proceed to formal evaluation for inclusion in the gea_synthesis data product.

Logical Mapping

Cross-referencing is typically conducted by the source map developer, or by the Atlas science team with extensive input and review by the source map developer. First, short and long descriptions of source map classes are recorded in the Global Ecosystems Atlas cross-reference tables. Each class is then systematically assigned to an ecosystem functional group through analysis of a range of properties of the data. Some source classes correspond directly with one ecosystem functional group, while others correspond only partially. Partial mismatches may also occur, where one map class from a source dataset is evaluated as potentially corresponding to more than one ecosystem functional group. In these cases, the cross-reference protocol uses expert advice on the most likely match (>50% match) as the final assignment in the synthesis map. The cross-reference tables, one per source dataset, enable each class in a source dataset to be assigned to its corresponding ecosystem functional group and are used as input into the geospatial data processing pipeline.
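The assignment rule described above can be sketched as follows. This is a minimal illustration, assuming expert-estimated memberships are recorded as fractions between 0 and 1; the function name, class names and membership values are invented for the example and do not come from the Atlas cross-reference tables.

```python
from typing import Dict, Optional

# Taken from the ">50% match" rule in the text.
MEMBERSHIP_THRESHOLD = 0.5


def assign_group(memberships: Dict[str, float]) -> Optional[str]:
    """Return the EFG code with more than 50% membership, or None if unresolved."""
    if not memberships:
        return None
    best_code = max(memberships, key=memberships.get)
    if memberships[best_code] > MEMBERSHIP_THRESHOLD:
        return best_code
    return None


# Hypothetical cross-reference entries: source class -> estimated memberships.
cross_reference = {
    "Lowland rainforest": {"T1.1": 1.0},                   # direct correspondence
    "Coastal wetland mosaic": {"MFT1.2": 0.7, "MFT1.3": 0.3},  # partial, resolved
    "Mixed woodland": {"T4.2": 0.5, "T2.2": 0.5},          # partial, unresolved
}

for source_class, memberships in cross_reference.items():
    print(source_class, "->", assign_group(memberships))
```

A class that returns `None` here corresponds to an unresolved outcome, which the text says is excluded from the synthesis product and flagged in the data mask.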

Expert review

Each cross-reference decision is reviewed by a qualified expert from the Atlas science team and offered back to the source data developers for additional review and comment.

Technical Approach

For each dataset, cross-reference tables are formatted by a scripted processing pipeline into the text inputs required by geospatial reclassification tools, which are then run directly within the broader data synthesis pipeline. A numbering system that assigns a numeric value to each ecosystem functional group enables the processing of both vector and image data into the gea_synthesis data product, where each pixel or vector element is assigned the numeric value that corresponds to a specific ecosystem functional group.
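As a rough sketch of this formatting step, the snippet below turns a small cross-reference table into plain-text reclassification rules. Everything specific here is an assumption made for illustration: the column names, the `old = new` rule syntax, the numeric codes in `EFG_NUMERIC` and the value used for unresolved classes are hypothetical, since the Atlas's actual table layout and numbering are not given in this section.

```python
import csv
import io
from typing import List

# Hypothetical numeric codes; the real Atlas numbering is not stated here.
EFG_NUMERIC = {"T1.1": 1, "MFT1.2": 2}
UNRESOLVED_VALUE = 0  # assumed placeholder for classes without an assignment

# Hypothetical cross-reference table for one source dataset.
TABLE_CSV = """
source_value,source_class,efg_code
10,Lowland rainforest,T1.1
20,Mangrove forest,MFT1.2
30,Unclassified mosaic,
"""


def build_reclass_rules(table_csv: str) -> List[str]:
    """Format each table row as an 'old = new' reclassification rule."""
    rules = []
    for row in csv.DictReader(io.StringIO(table_csv.strip())):
        target = EFG_NUMERIC.get(row["efg_code"], UNRESOLVED_VALUE)
        rules.append(f"{int(row['source_value'])} = {target}")
    return rules


print("\n".join(build_reclass_rules(TABLE_CSV)))
```

Keeping the rules as generated text means the cross-reference table remains the single place where a decision is edited, and the reclassification inputs can be rebuilt from it at any time.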

Data Processing and Quality Control

  • Spatial Processing

    The gea_synthesis data product is developed with a fully scripted processing pipeline that ingests source data, projects to a common coordinate system, transforms to a common spatial data format (Cloud-Optimised GeoTIFF) and aligns to a common spatial origin. All source datasets are resampled to the 100 m pixel resolution of the gea_synthesis data product.

  • Reclassification

    Each source dataset is reclassified using the cross-reference tables described above (see logical mapping) and processed into the multiple data layers that represent the three upper levels of the hierarchical Global Ecosystem Typology (realms, biomes and ecosystem functional groups). The result is the set of data layers that make up the synthesis data product: a set of cloud-optimised GeoTIFFs where each pixel value corresponds to a specific ecosystem functional group, biome and realm.

  • Quality Assurance

    Multiple layers of quality control are applied, including spatial accuracy checks, logical consistency tests, and expert reviews. A set of quality assurance data layers propagates per-pixel information about the source data, including the spatial resolution, the time period of source data production, and areas where independently developed source datasets overlap or disagree. A data mask classifies each pixel into one of three states: valid data, where an ecosystem functional group has been identified and is served in the synthesis product; no data, where data has yet to be obtained for the synthesis product; and unresolved cross-referencing outcomes, where the source class could only be partially matched (<50% membership), could not be cross-referenced to an ecosystem functional group, or requires further work to resolve uncertainty about its membership to an ecosystem functional group.
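The reclassification and data mask steps above can be sketched on a toy grid. This is an illustration only: the real pipeline operates on Cloud-Optimised GeoTIFFs, whereas this example uses nested Python lists, and the mask codes, lookup values and function name are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mask codes for the three pixel states described in the text.
MASK_NO_DATA = 0
MASK_VALID = 1
MASK_UNRESOLVED = 2

Grid = List[List[Optional[int]]]


def reclassify(
    grid: Grid, lookup: Dict[int, Optional[str]]
) -> Tuple[List[List[Optional[str]]], List[List[int]]]:
    """Map source class values to EFG codes and build a matching data mask."""
    efg_layer, mask_layer = [], []
    for row in grid:
        efg_row, mask_row = [], []
        for value in row:
            if value is None:
                # No source data covers this pixel.
                efg_row.append(None)
                mask_row.append(MASK_NO_DATA)
            elif lookup.get(value) is None:
                # Source class exists but has no resolved EFG assignment.
                efg_row.append(None)
                mask_row.append(MASK_UNRESOLVED)
            else:
                efg_row.append(lookup[value])
                mask_row.append(MASK_VALID)
        efg_layer.append(efg_row)
        mask_layer.append(mask_row)
    return efg_layer, mask_layer


# Hypothetical lookup built from a cross-reference table; 30 is unresolved.
lookup = {10: "T1.1", 20: "MFT1.2", 30: None}
source_grid = [[10, 20], [30, None]]

efg_layer, mask_layer = reclassify(source_grid, lookup)
print(efg_layer)   # [['T1.1', 'MFT1.2'], [None, None]]
print(mask_layer)  # [[1, 1], [2, 0]]
```

Biome and realm layers could be derived from the functional group layer in the same pass, since each group is nested within exactly one biome and realm.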
