Transforming Large Databases into Critical Knowledge Using Data Mining - Three Case Studies in South Carolina and Georgia

ABSTRACT
Data mining is an emerging field that addresses the issue of converting large databases into knowledge. Data mining methods come from different technical fields such as signal processing, statistics, and artificial intelligence. Data mining employs methods for maximizing the information content of data, determining which variables have the strongest relationships to problems of interest, and developing models that predict future outcomes. Data mining is used extensively in financial services, banking, advertising, manufacturing, and e-commerce to classify the behaviors of organizations and individuals and to predict future outcomes. This paper describes the results of three case studies where data mining, including artificial neural network models, has been applied to large-scale environmental issues in South Carolina and Georgia. For the Beaufort River, South Carolina, dissolved-oxygen models were developed and used to determine the Total Maximum Daily Load of allowable point-source effluent loading to the river. For the Savannah River estuary, models were developed to simulate pore-water salinity and used to determine the potential impacts of deepening the Savannah Harbor on upstream freshwater tidal marshes. For the Pee Dee River in South Carolina, models were developed to determine the minimum streamflow required to protect municipal intakes from seawater inundation along the Grand Strand of South Carolina. In the three studies, the models were able to convincingly reproduce historical behaviors and generate alternative scenarios of interest. To make the results of the studies directly available to all stakeholders, user-friendly decision support systems were developed as spreadsheet applications that integrate the historical database, models, user controls, streaming graphics, and simulation output.

INTRODUCTION
While environmental monitoring technologies have made it cost effective to acquire tremendous amounts of real-time hydrologic and water-quality data, there is greater demand to transform these data into the essential knowledge needed by State and local water-resource managers. It is imperative that new technologies be developed and adopted that facilitate faster and more accurate data analysis, modeling, and regulatory tool development. Data mining is an emerging field that addresses the issue of converting large databases into knowledge to solve problems that are complex because of the large numbers of variables involved. Data mining methods come from different technical fields such as signal processing, statistics, and artificial intelligence and are used extensively in financial services, banking, advertising, manufacturing, and e-commerce to classify the behaviors of organizations and individuals and to predict future outcomes. Data mining employs methods for maximizing the information content of data, determining which variables have the strongest relationships to problems of interest, and developing models that predict future outcomes. This knowledge encompasses both understanding of cause-effect relations and predicting the consequences of alternative actions.
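As a rough illustration of what "determining which variables have the strongest relationships to problems of interest" can look like in practice, the sketch below ranks candidate input time series by their strongest lagged Pearson correlation with a response series. This is a simple stand-in chosen for illustration and is not the method used in the case studies; the function names (`pearson`, `rank_inputs`), the `max_lag` parameter, and the synthetic streamflow and salinity values are all assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear information
    return sxy / math.sqrt(sxx * syy)

def rank_inputs(candidates, target, max_lag=3):
    """Rank candidate inputs by their strongest lagged correlation with target.

    candidates: dict mapping a (hypothetical) variable name to a list of values
    target: list of response values on the same time steps
    Returns a list of (name, best_lag, correlation), strongest first.
    """
    ranked = []
    for name, series in candidates.items():
        best_lag, best_r = 0, 0.0
        for lag in range(max_lag + 1):
            # Pair the input at time t - lag with the response at time t.
            x = series[: len(series) - lag]
            y = target[lag:]
            r = pearson(x, y)
            if abs(r) > abs(best_r):
                best_lag, best_r = lag, r
        ranked.append((name, best_lag, best_r))
    ranked.sort(key=lambda item: abs(item[2]), reverse=True)
    return ranked

# Synthetic, made-up series (not data from the studies).
streamflow = [5.0, 7.0, 4.0, 9.0, 6.0, 8.0, 3.0, 10.0, 5.5, 7.5]
noise = [1.0, 1.0, 2.0, 1.0, 2.0, 2.0, 1.0, 2.0, 1.0, 1.0]
# Assumed response: salinity falls two time steps after streamflow rises.
salinity = [20.0, 20.0] + [30.0 - 2.0 * q for q in streamflow[:-2]]

ranking = rank_inputs({"streamflow": streamflow, "noise": noise}, salinity)
for name, lag, r in ranking:
    print(f"{name}: best lag = {lag}, r = {r:.3f}")
```

In this toy setup the streamflow series should rank first at a lag of two time steps, because the synthetic salinity was constructed that way. Real tidal systems are nonlinear and multivariate, which is one reason an approach such as artificial neural network modeling may be preferred over simple linear screening.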

There are many environmental systems where tremendous historical databases exist. Generally these databases are underinterpreted and underutilized. Data mining offers an approach to transform these data into information and, ultimately, knowledge of how the environmental systems function. This paper describes the technical approach and results of three case studies where data mining was applied to large-scale environmental issues in South Carolina and Georgia to assist decision makers in the issuance of long-term permits. The three studies used existing databases for analysis and development of empirical models to understand how each system works and to address the salient concerns of decision makers. The large databases were transformed into information that provided new knowledge on how the systems function. The three case studies are (1) the determination of allowable point-source effluent loading to the Beaufort River for the design of a Regional Water Reclamation Facility; (2) the determination of the potential impacts of deepening the Savannah Harbor on upstream freshwater tidal marshes for the Environmental Impact Statement; and (3) the determination of minimum releases from North Carolina reservoirs required to protect municipal intakes from seawater inundation along the Grand Strand of South Carolina for a 50-year Federal Energy Regulatory Commission (FERC) license. The three case studies share several characteristics:

  • Utilized large existing historical databases;
  • Developed empirical models of complex tidal systems using Artificial Neural Network (ANN) models;
  • Developed Decision Support Systems (DSSs) that integrated databases, models, model simulation controls, streaming graphics, and model outputs in an easily disseminated spreadsheet application; and,
  • Produced results that have been, or currently (2006) are being, used for water-resource management with large long-term environmental, economic, and societal consequences.
