EarthFX - Earth Science Data Model
A Data Model is essentially a network of data tables developed to both efficiently store data as well as enforce a number of rules to ensure data integrity and consistency. The evolution of the EarthFX data model has been driven by the need of hydrogeologists to store large amounts of groundwater-related data and the need to rapidly retrieve and analyze these data. Through several iterations (beginning in 1991), we have made an effort to balance database theory (normalization) and user preferences (spreadsheets) to foster usability yet accommodate larger volumes of data. The EarthFX Data Model is currently being used to manage environmental data for a wide range of studies including a nuclear reactor facility, several regional groundwater studies, military bases, and a contaminated fractured rock site. The data model also works seamlessly with SiteFX to ensure ease of use, controlled data entry and rapid data extraction.
The evolution of the EarthFX data model has been driven by the need of hydrogeologists to store large amounts of groundwater-related data and the need to rapidly retrieve and analyze these data. Through several iterations, we have made an effort to balance database theory (normalization) and user preferences (spreadsheets) to foster usability yet accommodate larger volumes of data while using JET/ACCESS as the database engine and interface.
The first version of the data model, developed for a landfill in 1991, was biased towards simple groundwater monitoring applications, and was built on two key tables: BOREHOLE, which included details of the position, name, depth and date of the well, and SCREEN, which was added as an afterthought to accommodate boreholes with multiple screens. Time-series data was separated into various other tables, and the data model did not include reference or look-up tables.
This was undoubtedly a simple database, with eight tables, all of which looked like spreadsheets. Look-Up (or reference) tables were not used. This meant that all of the information in the tables was stored as the text strings, rather than codes. On the upside, novice users were quick to learn how to use the database; whereas on the downside, it was not scalable, and simply would not function under the load of the current applications, such as watershed or regional databases.
The data model underwent significant changes with its application to the Smithville PCB Site in 1997. The Smithville site is a small property of several acres located south of Grimsby, Ontario, just outside the village of Smithville. Used transformers were stored and partially recycled on the property. Consequently, PCB waste oil entered the fractured-rock groundwater regime. The Ontario Ministry of the Environment took over the site and managed several years of investigative work to further define groundwater contamination characteristics. This lead to massive amount of time-series data, extracted from a variety of inclined monitoring wells completed with WestBay equipment.
This project led to scalability within the data model. Nested wells were no longer vertical, and were tested with packers or WestBay equipment. This required greater flexibility in the database fields in order to store the physical configuration of the wells. Frequent water-level measurements, pumping tests and water-quality sampling led to a re-evaluation of the time-series data tables, and the creation of the D_INTERVAL series of tables in the current EarthFX data model. Look-Up tables were introduced to encode repetitive information such as water-quality parameter names. These look-ups enforced strict QA/QC on the families of volatile organic compound parameters (often with multiple, yet similar names) being tested.
In 1999, EarthFX was retained by a petroleum-engineering firm to support their data management and analysis needs for oil-well exploration data. This led to an important maturing of the data model as EarthFX gained experience with the global-scale petroleum data models of POSC (Petrotechnical Open Software Corporation) and PPDM (Public Petroleum Data Model). These data models are immense, and include fields and tables for all tabular data collected from all aspects of the petroleum business cycle. This exposure provided EarthFX with a ‘final vision’ of a comprehensive groundwater data model. This vision provides guidance in the development of the data model process and reinforced the notion that for large applications, data models are complex.
The final step in the evolution of the EarthFX data model was its application at Ontario Power Generation’s (OPG) Pickering Nuclear Generation Station to manage their environmental monitoring program. OPG was collecting monitoring data from wells, surface-water stations, sumps, sewers, trenches, manholes, weather stations and rabbit holes and thus required a data management system to consolidate the data that was collected. This expanded the EarthFX data model from being largely ‘well based’, to being ‘hydraulic based’, a process achieved by the incorporation of more generic name conventions (such as Location and Interval rather then Borehole and Screen). In addition, this led to the removal of well-specific fields, which were replaced by generic fields applicable to groundwater, surface water and meteorological data sets.