Closely related to errors in weather observations are the irregularities that a number of other factors may introduce into the data. If these irregularities are not removed, the data cannot reliably be used to draw conclusions regarding warming trends or climate change. They include everything that may influence weather observations apart from the actual weather, such as:
- Land modification or changes in the environment surrounding weather stations, which may cause an artificial trend to emerge in the data over long time periods. Artificial warming may arise, for example, where a weather station was once located in an undeveloped area that developed over time into a city with tar roads and high-rise buildings, an effect known as the urban heat island (UHI). Conversely, artificial cooling may arise where large-scale irrigation systems are installed. Although these artificial trends may in fact correctly describe the temperature history of the specific location, they obscure the true atmospheric conditions over time.
- Relocation of weather stations, which has the same effect as the point above. Moving a station from a cool to a warm area, or from an open area to a built-up location, will introduce a spurious pattern in the data.
- Changes in measuring methods and instrumentation, which will also introduce incorrect trends. Changing the type of thermometer, the type of Stevenson screen, or the method of averaging data from electronic sensors may all introduce unwanted artifacts into the weather record.
The process of detecting and removing these irregularities, or non-climatic effects, from climate data is known as homogenisation. The aim is to correct artificial changes (typically by adjusting the raw measured data) to create a homogeneous record that is presumably more consistent over time and more accurately reflects the true climate history. Such a record can then be used to draw conclusions regarding long-term climate patterns. The process of homogenising climate data relies on one or more of the following:
- Statistical methods to identify and correct abrupt changes or breakpoints in a data series.
- Metadata or station history documents, which contain dates and other information regarding changes to a weather station and its equipment.
- Parallel data records, or similar weather measurements taken at neighbouring weather stations over the same time period. The statistical methods mentioned in the first point are typically applied to these parallel records in combination with the data to be homogenised from the candidate station.
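As a rough illustration of how these components can work together, the sketch below corrects a step change whose date is taken from station metadata, using a neighbouring record to estimate its size. The function name `adjust_known_break`, the temperature values, and the choice to shift the earlier segment so that it matches the most recent one are all assumptions made for this example, not a prescribed procedure.

```python
from statistics import mean


def adjust_known_break(candidate, reference, break_index):
    """Align the segment before a documented break with the segment after it.

    The step size is estimated from the candidate-minus-reference
    differences, so a climate signal shared by both stations cancels out.
    """
    if len(candidate) != len(reference):
        raise ValueError("series must cover the same time steps")
    if not 0 < break_index < len(candidate):
        raise ValueError("break_index must leave two non-empty segments")

    diff = [c - r for c, r in zip(candidate, reference)]
    step = mean(diff[break_index:]) - mean(diff[:break_index])

    # Assumed convention: keep the most recent segment as measured and
    # shift the earlier segment to match it.
    adjusted = [c + step for c in candidate[:break_index]]
    adjusted += list(candidate[break_index:])
    return adjusted, step


# Made-up annual mean temperatures (°C). The candidate station is assumed
# to have been relocated just before the fifth value (index 4).
reference = [14.0, 14.2, 13.9, 14.1, 14.3, 14.0, 14.2, 14.4]
candidate = [15.0, 15.2, 14.9, 15.1, 15.8, 15.5, 15.7, 15.9]

adjusted, step = adjust_known_break(candidate, reference, break_index=4)
print(round(step, 2))  # 0.5
```

In this toy case the estimated step is about 0.5 °C, which is added to the first four candidate values. Real records would need longer segments and some estimate of the uncertainty in the step.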
Several homogenisation techniques have been developed and differ in how much emphasis is given to each of the above components. Homogenisation techniques can furthermore be classified as follows.
Objective and subjective homogenisation
Objective homogenisation techniques detect changes and adjust the data automatically according to some algorithm. The advantages of these techniques are their reproducibility and the ease with which they process large datasets. In practice, however, full objectivity may be difficult to achieve, as there may be different ways of implementing an algorithm in software, and human intervention may still be required for a small subset of the data (e.g. a few borderline data samples that need special attention).
Subjective homogenisation techniques mostly rely on judgments made by climate experts. Subjective assessment is especially used when dealing with incomplete historical measurements and metadata where large uncertainty is present, e.g. where records are inconclusive or contradictory, or where a variety of sources is used (e.g. newspaper archives documenting changes in weather station locations). The use of subjective methods is further justified based on the unique circumstances and history of each individual weather station.
Absolute and relative methods
Absolute homogenisation methods consider individual station records in isolation, using only the measurements and metadata of each station. These methods are therefore limited to applying statistical tests to a single time series, and cannot distinguish between natural and artificial causes of discontinuities unless supported by station metadata. Their capability of detecting true climate signals from single station records is therefore also limited.
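A single-series test of this kind can be sketched as follows. The statistic has the same form as the standard normal homogeneity test, T(k) = k·z̄₁² + (n − k)·z̄₂², computed on the standardised series; the function name, the use of the population standard deviation, and the data are assumptions for illustration. Critical values for deciding whether the maximum is significant are not included.

```python
from statistics import mean, pstdev


def max_shift_statistic(series):
    """Return (k, T) for the split point k that maximises T(k).

    T(k) = k * mean(z[:k])**2 + (n - k) * mean(z[k:])**2, where z is the
    standardised series. A large T suggests a shift in the mean at k.
    """
    n = len(series)
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("series is constant")
    z = [(x - mu) / sigma for x in series]

    best_k, best_t = None, 0.0
    for k in range(1, n):
        t = k * mean(z[:k]) ** 2 + (n - k) * mean(z[k:]) ** 2
        if t > best_t:
            best_k, best_t = k, t
    return best_k, best_t


# Made-up annual means (°C) with a 1.0 °C jump before the seventh value.
series = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0,
          11.1, 10.8, 11.0, 11.2, 10.9, 11.0]

break_index, t_max = max_shift_statistic(series)
print(break_index)  # 6
```

The test flags the jump at index 6 but says nothing about its cause. Without metadata, a relocation and a genuine regional shift would look the same here.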
Relative homogenisation methods compare data from the candidate station (to be homogenised) with neighbouring or reference stations. For example, the difference time series between the candidate and reference stations can be used to detect inhomogeneities, assuming nearby stations are sufficiently synchronous (as they are exposed to approximately the same climate signal). The performance of relative methods can be improved by selecting reference stations according to some criteria, such as ensuring that each reference station is homogeneous over a specified time period and that it is highly correlated with the candidate station.
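The sketch below illustrates the idea under simplifying assumptions: reference stations are chosen by the correlation of their year-to-year changes with those of the candidate, and the difference series is taken against the mean of the selected references. The 0.8 threshold, the use of first differences for the correlation, the station names, and all data values are hypothetical choices for this example.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def first_differences(series):
    """Change from each time step to the next."""
    return [b - a for a, b in zip(series, series[1:])]


def select_references(candidate, neighbours, min_corr=0.8):
    """Keep neighbours whose changes correlate well with the candidate's."""
    dc = first_differences(candidate)
    return {name: s for name, s in neighbours.items()
            if pearson(dc, first_differences(s)) >= min_corr}


def difference_series(candidate, references):
    """Candidate minus the mean of the reference stations at each step."""
    ref_mean = [mean(values) for values in zip(*references.values())]
    return [c - r for c, r in zip(candidate, ref_mean)]


# Made-up shared climate signal, plus a 0.8 °C artificial step in the
# candidate from index 5 onwards.
signal = [0.0, 0.3, -0.2, 0.4, 0.1, 0.5, 0.2, 0.6, 0.3, 0.7]
candidate = [15.0 + s + (0.8 if i >= 5 else 0.0) for i, s in enumerate(signal)]
neighbours = {
    "A": [14.0 + s for s in signal],
    "B": [13.5 + s for s in signal],
    "C": [12.0, 11.8, 12.3, 11.7, 12.1, 11.9, 12.4, 12.0, 12.2, 11.8],
}

references = select_references(candidate, neighbours)
diff = difference_series(candidate, references)
print(sorted(references))  # ['A', 'B']
```

Because the shared signal cancels, the difference series in this idealised case is flat at about 1.25 °C before the step and about 2.05 °C after it, which makes the inhomogeneity much easier to detect than in the raw candidate record. The sketch does not check that the references are themselves homogeneous, which the selection criteria above would also require.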