The sampling frequency and quantity of time series data collected from water distribution systems have been increasing in recent years, giving rise to the potential for improving system knowledge if suitable automated techniques, in particular machine learning, can be applied. Novelty (or anomaly) detection refers to the automatic identification of novel or abnormal patterns embedded in large amounts of “normal” data. For time series data (transformed into vectors), this means identifying abnormal events embedded amongst many normal time series points. The support vector machine is a data-driven statistical technique that has been developed as a tool for classification and regression. Its key features include statistical robustness with respect to non-Gaussian errors and outliers, the selection of the decision boundary in a principled way, and the introduction of nonlinearity in the feature space by means of kernel functions, without explicitly requiring a nonlinear algorithm. In this research, support vector regression is used as a learning method for anomaly detection from water flow and pressure time series data. No use is made of past event histories collected through other information sources. The support vector regression methodology, whose robustness derives from the training error function, is applied to a case study.
Keywords: data analysis, leakage, novelty detection, support vector machines, water distribution systems