Acoustic echo cancellation (AEC) and acoustic echo suppression (AES) are widely researched topics. However, only a few papers on hybrid or deep acoustic echo control provide a solid comparative analysis of their methods, as was common for classical signal processing approaches. An AEC/AES model can show distinct differences in behaviour that cannot be fully represented by a single metric or test condition, especially when classical signal processing and machine-learned approaches are compared. These characteristics include convergence behaviour, reliability under varying speech levels or far-end signal types, and robustness to adverse conditions such as harsh nonlinearities, room impulse response switches or continuous changes, or delayed echo. The first contribution of this article is an extended set of test conditions and metrics that yields a proper characterization of an AEC/AES model and provides researchers with a useful toolbox to benchmark their systems. Second, we evaluate multiple AEC/AES models, each representing a classical, machine-learned, or hybrid paradigm, in various test conditions. We provide an analysis of, and new insights into, their strengths and weaknesses, and identify limitations of common metrics in some cases. Our entire toolbox of evaluation metrics and testing conditions is available on GitHub.