A standard method of publishing and sharing scientific data
A standard method for publishing different types of data generated from environmental observations has been developed. It allows researchers to publish data from many different sources so that they are freely available, can be easily understood, and can be integrated with observational data from other sources.
Scientific research is generating more information than ever before. Although technological advances allow results to be shared more widely, the enormous variety of data types makes it challenging to organise and integrate them into meaningful formats that are accessible to all users.
The researchers have developed a Hydrologic Information System (HIS) for publishing point observations (measurements made at a particular location, such as a weather station or stream gauge). The HIS components provide a framework that allows researchers working in different scientific domains to create and share multidisciplinary data within a common network.
To use the system, known as the CUAHSI HIS, researchers collect observational data manually (for example, water quality sampling data) or from field sensors (for example, climate monitoring data). These data, which may come from many different sources and be recorded in different formats, are labelled with appropriate descriptions, using terms that help users from other fields to interpret them. A central registry informs the public that the data are available and can be accessed via the internet. A dedicated web service handles communication between users and the databases, and the databases can be searched using a specialised search engine.
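As a rough illustration of this workflow, the Python sketch below models described point observations, separate source databases, and a central registry that can be searched across all of them. It is only a minimal sketch: every class, field and function name is hypothetical, the sample values are invented for illustration, and nothing here corresponds to the actual CUAHSI HIS software or its web service interface.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class PointObservation:
    """One measurement at one location, with the descriptions needed to interpret it."""
    site_code: str       # hypothetical site identifier
    variable: str        # descriptive term for what was measured
    units: str
    method: str          # e.g. "field sensor" or "manual sample"
    timestamp: datetime
    value: float


@dataclass
class SourceDatabase:
    """A database kept by one data provider (hypothetical structure)."""
    name: str
    observations: list = field(default_factory=list)

    def add(self, observation):
        self.observations.append(observation)


class CentralRegistry:
    """Keeps track of which databases are available and searches across them."""

    def __init__(self):
        self._databases = {}

    def register(self, database):
        self._databases[database.name] = database

    def search(self, variable):
        """Return (database name, observation) pairs for the requested variable."""
        return [
            (name, obs)
            for name, db in self._databases.items()
            for obs in db.observations
            if obs.variable == variable
        ]


# Illustrative data only: the sites and values are invented.
sensor_db = SourceDatabase("stream_gauge_db")
sensor_db.add(PointObservation("SITE-A", "water_temperature", "degC",
                               "field sensor", datetime(2008, 6, 1, 12, 0), 14.2))

sampling_db = SourceDatabase("water_quality_db")
sampling_db.add(PointObservation("SITE-B", "nitrate", "mg/L",
                                 "manual sample", datetime(2008, 6, 2, 9, 30), 1.8))
sampling_db.add(PointObservation("SITE-B", "water_temperature", "degC",
                                 "manual sample", datetime(2008, 6, 2, 9, 30), 15.0))

registry = CentralRegistry()
registry.register(sensor_db)
registry.register(sampling_db)

hits = registry.search("water_temperature")
for db_name, obs in hits:
    print(db_name, obs.site_code, obs.value, obs.units)
```

Because every observation carries its own variable name, units and method, a single search can return comparable records from databases that were compiled independently, which is the kind of integration the labelling step is meant to support.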
The system has been demonstrated within a national network of environmental observatory test beds. As part of the planning and development for a national network of large-scale environmental observatories, 11 observatory test bed projects have been established across the United States. Many types of data from a variety of scientific disciplines have been collected (e.g., groundwater levels and quality, and samples of nutrients and sediments from rivers and streams) and published using the CUAHSI HIS.
Using the system, the diverse test bed data are published as a single national network of research data in a common form. Each test bed maintains its own databases, and each decides which data to publish. For example, some make raw data available, while others publish only data that have passed quality control checks. By June 2008, the environmental observatory test bed data network had established 31 databases covering 3767 monitoring sites and had published nearly 42 million point observations.
One advantage of this system is that it encourages and enables the publication of data that might otherwise be confined to the private files of individual investigators. In addition, the uniformity of the fully described data reduces errors in interpretation by users. Although the framework was set up to share data on water resources, the researchers suggest that it can be applied to any other domain that collects point observations.