This paper addresses the difficult question of how to perform meaningful comparisons between neural network-based hydrological models and alternative modelling approaches. Standard, goodness-of-fit metric approaches are limited since they only assess numerical performance and not physical legitimacy of the means by which output is achieved. Consequently, the potential for general application or catchment transfer of such models is seldom understood. This paper presents a partial derivative, relative sensitivity analysis method as a consistent means by which the physical legitimacy of models can be evaluated. It is used to compare the behaviour and physical rationality of a generalised linear model and two neural network models for predicting median flood magnitude in rural catchments. The different models perform similarly in terms of goodness-of-fit statistics, but behave quite distinctly when the relative sensitivities of their inputs are evaluated. The neural solutions are seen to offer an encouraging degree of physical legitimacy in their behaviour, over that of a generalised linear modelling counterpart, particularly when overfitting is constrained. This indicates that neural models offer preferable solutions for transfer into ungauged catchments. Thus, the importance of understanding both model performance and physical legitimacy when comparing neural models with alternative modelling approaches is demonstrated.