The number of automated measuring and reporting systems used in water distribution and sewer systems is increasing dramatically and, as a consequence, so is the volume of data acquired. Since data acquisition equipment is not perfect and real-time data are likely to contain a certain number of anomalous values, it is essential to equip the SCADA (Supervisory Control and Data Acquisition) system with automatic procedures that can detect the related problems and assist the user in monitoring and managing the incoming data. A number of different anomaly detection techniques and methods exist and can be used with varying success. To improve their performance, these methods must be fine-tuned according to crucial aspects of the monitored process and the contexts in which the data are classified. The aim of this paper is to explore whether data context classification and pre-processing techniques can be used to improve anomaly detection methods, especially in fully automated systems. The methodology developed is tested on sets of real-life data, using different standard and experimental anomaly detection procedures, including statistical, model-based and data-mining approaches. The results obtained clearly demonstrate the effectiveness of the suggested anomaly detection methodology.
Keywords: anomaly detection, context-classification-based detection, data pre-processing, sewer data