Matrix factorization is a very powerful tool to find graph patterns, e.g. communities, anomalies, etc. A recent trend is to improve the usability of the discovered graph patterns, by encoding some interpretation-friendly properties (e.g., non-negativity, sparseness, etc) in the factorization. Most, if not all, of these methods are tailored for the task of community detection.We propose NrMF, a non-negative residual matrix factorization framework, aiming to improve the interpretation for graph anomaly detection. We present two optimization formations and their corresponding optimization solutions. Our method can naturally capture abnormal behaviors on graphs. We further generalize it to admit sparse constrains in the residual matrix. The effectiveness and efficiency of the proposed algorithms are analyzed, showing that our algorithm (i) leads to a local optima; and (ii) scales to large graphs. The experimental results on several data sets validate its effectiveness as well as efficiency.