Abstract. We present a general framework for the rigorous numerical analysis of time-fractional nonlinear parabolic partial differential equations, with a fractional derivative of order α ∈ (0, 1) in time. It relies on three technical tools: a fractional version of the discrete Grönwall-type inequality, discrete maximal regularity, and regularity theory of nonlinear equations. We establish a general criterion for showing the fractional discrete Grönwall inequality, and verify it for the L1 scheme and convolution quadrature generated by BDFs. Further, we provide a complete solution theory, e.g., existence, uniqueness and regularity, for a time-fractional diffusion equation with a Lipschitz nonlinear source term. Together with the known results of discrete maximal regularity, we derive pointwise L 2 (Ω) norm error estimates for semidiscrete Galerkin finite element solutions and fully discrete solutions, which are of order O(h 2 ) (up to a logarithmic factor) and O(τ α ), respectively, without any extra regularity assumption on the solution or compatibility condition on the problem data. The sharpness of the convergence rates is supported by the numerical experiments.Keywords: nonlinear fractional diffusion equation, discrete fractional Grönwall inequality, L1 scheme, convolution quadrature, error estimate 1. Introduction. Time-fractional parabolic partial differential equations (PDEs) have been very popular for modeling anomalously slow transport processes in the past two decades. These models are commonly referred to as fractional diffusion or subdiffusion. At a microscopic level, the underlying stochastic process is continuous time random walk [32]. So far they have been successfully applied in a broad range of diversified research areas, e.g., thermal diffusion in fractal domains [35], flow in highly heterogeneous aquifer [6] and single-molecular protein dynamics [20], just to name a few. Hence, the rigorous numerical analysis of such problems is of great practical importance. For the linear problem, various efficient time stepping schemes have been proposed, which include mainly two classes: L1 type schemes and convolution quadrature (CQ).L1 type schemes approximate the fractional derivative by replacing the integrand with its piecewise polynomial interpolation [24,26,37,3] and thus generalize the classical finite difference method. The piecewise linear case has a local truncation error O(τ 2−α ) for sufficiently smooth solution, where τ denotes the time step size. See also [31,33] for the discontinuous Galerkin method. CQ is a flexible framework introduced by Lubich [27,28] for constructing high-order time discretization methods for approximating fractional derivatives. It approximates the fractional derivative in the Laplace domain and automatically inherits the stability property of general linear multistep methods. See [10,39,40,16] for CQ type schemes. Optimal error estimates have been derived for both spatially semidiscrete and fully discrete schemes, including problems with nonsmooth data [10,14,31,16]....