This paper extends previous research to analytically identify leaks within a water distribution network (WDN) by combining hydraulic simulation with network-science-based data analysis techniques. The WDN model is used to run several ‘leakage scenarios’, varying leak location (pipe) and severity, and to build a dataset of the corresponding leak-induced variations in pressure and flow. All junctions and pipes are considered as candidate locations for deploying pressure and flow sensors; a clustering procedure on these locations identifies the most relevant nodes and pipes, taking cost-effectiveness into account. A graph is then generated from the dataset, with scenarios as nodes and edges weighted by the similarity between each pair of scenarios in terms of the pressure and flow variations due to the leak. Spectral clustering groups similar scenarios together in the eigenspace spanned by the most relevant eigenvectors of the Laplacian matrix of the graph. The method is presented as an improvement over traditional techniques. Finally, a support vector machine (SVM) classifier is trained to learn the relation between variations in pressure and flow at the deployed meters and the most probable set of pipes affected by the leak.
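The scenario-graph and spectral-clustering steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the scenario values are made up, the Gaussian similarity kernel and its width `sigma` are assumptions, and the clustering is simplified to a two-way split on the sign of a single Laplacian eigenvector (the Fiedler vector) rather than a clustering in the space of several eigenvectors. The SVM stage is omitted.

```python
import math

# Hypothetical leakage scenarios (values are made up for illustration).
# Each row: (pressure variation at one junction, flow variation at one pipe).
scenarios = [
    (-2.0, 0.5),  # leak in zone A
    (-0.3, 2.0),  # leak in zone B
    (-2.2, 0.6),  # leak in zone A
    (-0.4, 2.2),  # leak in zone B
    (-1.9, 0.4),  # leak in zone A
    (-0.2, 1.9),  # leak in zone B
]


def similarity_graph(features, sigma=1.0):
    """Weighted adjacency matrix W: Gaussian similarity between scenarios.

    The kernel choice and sigma are assumptions of this sketch.
    """
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def fiedler_vector(W, iters=200):
    """Approximate the eigenvector of the second-smallest eigenvalue
    of the unnormalized graph Laplacian L = D - W."""
    n = len(W)
    deg = [sum(row) for row in W]
    # M = c*I - L shares eigenvectors with L, with the eigenvalue order
    # reversed; c = 2 * max degree bounds the largest eigenvalue of L.
    c = 2.0 * max(deg)
    # Deterministic start vector with zero mean.
    v = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        w = [(c - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]
        # Remove the constant component (eigenvector of eigenvalue 0 of L).
        mean = sum(w) / n
        w = [x - mean for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


W = similarity_graph(scenarios)
v = fiedler_vector(W)
# Two-way split: scenarios on the same side of zero share a cluster.
labels = [1 if x > 0 else 0 for x in v]
print(labels)
```

With these made-up values, the scenarios at even positions should receive one label and those at odd positions the other. In the full pipeline the cluster labels would then serve as targets for the classifier trained on sensor readings.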