Abstract. We present and analyze a hybrid technique to numerically solve strongly monotone nonlinear problems by the discontinuous Petrov-Galerkin method with optimal test functions (DPG). Our strategy is to relax the nonlinear problem to a linear one with an additional unknown and to consider the nonlinear relation as a constraint. We propose to use optimal test functions only for the linear problem and to enforce the nonlinear constraint by penalization. In fact, our scheme can be seen as a minimum residual method with a nonlinear penalty term. We develop an abstract framework of the relaxed DPG scheme and prove, under appropriate assumptions, the well-posedness of the continuous formulation and the quasi-optimal convergence of its discretization. As an application we consider an advection-diffusion problem with nonlinear diffusion of strongly monotone type. Some numerical results in the lowest-order setting are presented to illustrate the predicted convergence.

Key words. Discontinuous Petrov-Galerkin method, optimal test functions, strongly monotone operator, advection-diffusion, nonlinear penalty.

AMS subject classifications. 65N30, 65J15, 65N12, 47H05

1. Introduction. In recent years, the discontinuous Petrov-Galerkin method with optimal test functions ("DPG method" in the following) has proved to be an attractive strategy to produce inf-sup stable approximations for a wide class of problems. The basic setting stems from Demkowicz and Gopalakrishnan [14,13] and has been extended, e.g., to linear elasticity [1,18], the Stokes and Maxwell equations [28,7], the Schrödinger equation [15], and boundary integral and fractional equations [24,17]. Another promising application area is singularly perturbed problems [16,9,3,4,25].

All the cited references, however, deal with linear problems. An extension of the DPG technology to nonlinear problems, on the other hand, is a delicate issue.
The principal problem is that the calculation (or approximation) of optimal test functions involves an application of the underlying operator (the DPG method is a minimum residual method). For nonlinear problems this step thus becomes nonlinear, i.e., expensive. One way to circumvent the nonlinearity is, of course, to linearize the underlying problem. This has been the approach in [8,29]. A different idea is to apply the minimum residual technique in product or "broken" spaces to the nonlinear problem. Bui-Thanh and Ghattas [5] did this by considering the entire nonlinear problem as a constraint, and Carstensen et al. [6] developed a representation of the DPG scheme by a nonlinear mixed form and analyzed the case of lowest-order approximations. DPG for contact problems has been studied in [21], though in this case the nonlinearity is due to the contact condition, which is treated by a variational inequality. We also note that Muga and van der Zee [27] study problems posed in Banach spaces. In those cases the calculation of optimal test functions becomes nonlinear even though the underlying PDE is linear.

In this paper we propose a combined scheme that empl...