Uncertainty matters: ascertaining where specimens in natural history collections come from and its implications for predicting species distributions
Natural history collections (NHCs) represent an enormous and largely untapped wealth of information on the Earth's biota, made available through GBIF as digital preserved specimen records. Precise knowledge of where the specimens were collected is paramount to rigorous ecological studies, especially in the field of species distribution modelling. Here, we present a first comprehensive analysis of georeferencing quality for all preserved specimen records served by GBIF, and illustrate the impact that coordinate uncertainty may have on predicted potential distributions. We used all GBIF preserved specimen records to analyse the availability of coordinates and associated spatial uncertainty across geography, spatial resolution, taxonomy, publishing institutions and collection time. We used three plant species across their native ranges in different parts of the world to show the impact of uncertainty on predicted potential distributions. We found that 38% of the 180+ million records provide coordinates only and 18% coordinates and uncertainty. Georeferencing quality is determined more by country of collection and publishing than by taxonomic group. Distinct georeferencing practices are more determinant than implicit characteristics and georeferencing difficulty of specimens. Availability and quality of records contrasts across world regions. Uncertainty values are not normally distributed but peak at very distinct values, which can be traced back to specific regions of the world. Uncertainty leads to a wide spectrum of range sizes when modelling species distributions, potentially affecting conclusions in biogeographical and climate change studies. In summary, the digitised fraction of the world's NHCs are far from optimal in terms of georeferencing and quality mainly depends on where the collections are hosted. A collective effort between communities around NHC institutions, ecological research and data infrastructure is needed to bring the data on a par with its importance and relevance for ecological research.