This overview provides background on the input datasets that are widely used in biogeochemical and related applications, and introduces some standard conversion and scaling techniques required to use different datasets in conjunction.

Meteorological or climatological data can originate from various sources. The choice of a data set for application in ecological models depends on the availability of the data, but the sources also differ in their strengths and weaknesses, which are highlighted below.
Fig. 1. Increase in spatial resolution of global climate models between the IPCC's first (1990) and fourth (2007) assessment reports (Source: IPCC Fourth Assessment Report, Working Group I: The Physical Science Basis, Figure 1.4).

Data sets can be characterized by their spatial and temporal domain and resolution.

Specific datasets: Land cover / land use recent data, historical data and projections

Land cover and land use are often used synonymously, but they are not the same. Land cover describes the physical cover of the land (e.g. the type of vegetation), whereas land use is a classification of the actual use of the land by humans. Since many land cover classes directly reflect human use, e.g. croplands, the two kinds of data are strongly linked to each other.

Land cover can be determined either by field surveys or by remote sensing. While some fine scale maps (e.g. within a municipality) rely on field surveys, most available datasets are derived from remotely sensed reflectance data.

One of the most commonly used Europe-wide data sets for land cover is the CORINE land cover (CLC2000) data. It originated from an effort of the European Environment Agency (EEA) to generate and regularly update a common database with similar classification schemes. CLC2000 is an update of a previous version (CLC90), so the effects of land cover (or in some cases land use) change can also be analysed. However, because the classification schemes were adapted and the resolution of the input data increased between the two versions, some inconsistencies exist between them. Since each partner state performs the classification itself, the states provide different fine scale data sets in different formats. Additionally, a harmonized European gridded dataset with a 500 m resolution is available with the following classification scheme.

The CLC2000 data can be used to infer the current land use at a certain resolution. However, since the current vegetation is a product of previous land use activities, past land use is also of interest in a number of applications.

The Historical Database of the Global Environment (HYDE 3.1, Klein Goldewijk et al., 2011 and references therein) is a spatially explicit dataset of human-induced global land use. It is based on historical records of population density, remotely sensed images of contemporary land cover, land census data and modelling. HYDE 3.1 defines the relative share of croplands and pasture lands during the period 10000 BC to 2000 AD (with a millennial resolution before AD 0, a centennial resolution until AD 1700 and a decadal resolution thereafter).
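The mixed temporal resolution described above can be sketched as a list of time slices. This is only an illustration: the function names are hypothetical, years BC are written as negative numbers, and the exact placement of the boundary years (AD 0 counted as the first centennial slice, AD 1700 as the first decadal slice) is an assumption, not taken from the HYDE documentation.

```python
from bisect import bisect_right


def hyde_time_steps():
    """Build time slices following the resolution described in the text.

    Years BC are negative. Boundary handling is an assumption.
    """
    millennial = list(range(-10000, 0, 1000))  # 10000 BC ... 1000 BC
    centennial = list(range(0, 1700, 100))     # AD 0 ... AD 1600
    decadal = list(range(1700, 2001, 10))      # AD 1700 ... AD 2000
    return millennial + centennial + decadal


def latest_slice_at_or_before(year, steps=None):
    """Return the most recent time slice that is not later than `year`."""
    if steps is None:
        steps = hyde_time_steps()
    index = bisect_right(steps, year) - 1
    if index < 0:
        raise ValueError("year lies before the first time slice")
    return steps[index]
```

A model that needs land use for the year 1234, for example, would under these assumptions fall back on the AD 1200 slice, or interpolate between the AD 1200 and AD 1300 slices.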

Fig. 2. Fraction of gridcell under natural vegetation from KK10 dataset.

The Kaplan and Krumhardt 2010 dataset (KK10, Kaplan et al., 2010 and references therein) consists of an annually resolved time series of anthropogenic deforestation over the past 8000 years. KK10 was created by 1) digitizing and synthesizing a database of population history for Europe and surrounding areas, 2) developing a model that simulates anthropogenic deforestation based on population density and accounts for technological progress, and 3) applying the database and model to a gridded dataset of land suitability for agriculture and pasture to simulate spatial and temporal trends in anthropogenic deforestation. The simulations are based on historical observations and estimates of population density, climate, soil properties and land suitability for cultivation and pasture, and take technological developments into consideration.

Both the HYDE 3.1 and the KK10 datasets have a spatial resolution of 0.5° × 0.5°. The KK10 scenario makes the central assumption that humans use land more intensively in all regions of the world as population density increases and land becomes scarcer. In contrast, the standard version of HYDE is based on a nearly linear relationship between population and the area of land under agriculture, and shows very little variation in per capita land use.
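Because both datasets are given on a regular 0.5° latitude-longitude grid, a gridcell fraction (such as the fraction under natural vegetation in Fig. 2) corresponds to a different absolute area depending on latitude. The following minimal sketch shows one common way to do this conversion. It assumes a spherical Earth with a mean radius of 6371 km; the function names are hypothetical and not part of either dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def cell_area_km2(lat_south_deg, resolution_deg=0.5):
    """Area of a lat-lon gridcell whose southern edge is at lat_south_deg."""
    lat_south = math.radians(lat_south_deg)
    lat_north = math.radians(lat_south_deg + resolution_deg)
    delta_lon = math.radians(resolution_deg)
    # Area of a spherical rectangle: R^2 * d(lon) * (sin(lat2) - sin(lat1))
    return EARTH_RADIUS_KM ** 2 * delta_lon * (
        math.sin(lat_north) - math.sin(lat_south)
    )


def fraction_to_area_km2(fraction, lat_south_deg, resolution_deg=0.5):
    """Convert a gridcell fraction (0..1) into an area in square kilometres."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return fraction * cell_area_km2(lat_south_deg, resolution_deg)
```

Under this approximation a gridcell at 60° latitude covers about half the area of a gridcell at the equator, so fractions should be weighted by cell area before they are summed over a region.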


Kaplan, J., Krumhardt, K.M., Ellis, E.C., Ruddiman, W.F., Lemmen, C. & Klein Goldewijk, K. (2010) Holocene carbon emissions as a result of anthropogenic land cover change. The Holocene, doi: 10.1177/0959683610386983.

Klein Goldewijk, K., Beusen, A., van Drecht, G. & de Vos, M. (2011) The HYDE 3.1 spatially explicit database of human-induced global land-use change over the past 12,000 years. Global Ecology and Biogeography, 20, 73-86. doi: 10.1111/j.1466-8238.2010.00587.x