SUMMARY
This paper applies a Bayesian approach to examine non-linearity for the 1-D magnetotelluric (MT) inverse problem. In a Bayesian formulation the posterior probability density (PPD), which combines data and prior information, is interpreted in terms of parameter estimates and uncertainties, which requires optimizing and integrating the PPD. Much work on 1-D MT inversion has been based on (approximate) linearized solutions, but more recently fully non-linear (numerical) approaches have been applied. This paper directly compares results of linearized and non-linear uncertainty estimation for 1-D MT inversion; to do so, advanced methods for both approaches are applied. In the non-linear formulation used here, numerical optimization is carried out using an adaptive-hybrid algorithm. Numerical integration applies Metropolis-Hastings sampling, rotated to a principal-component parameter space for efficient sampling of correlated parameters, and employing non-unity sampling temperatures to ensure global sampling. Since appropriate model parametrizations are generally not known a priori, both under- and overparametrized approaches are considered. For underparametrization, the Bayesian information criterion is applied to determine the number of layers consistent with the resolving power of the data. For overparametrization, prior information is included which favours simple structure in a manner similar to regularized inversion. The data variance and/or trade-off parameter regulating data and prior information are treated in several ways, including applying fixed optimal estimates (an empirical Bayesian approach) or including them as hyperparameters in the sampling (hierarchical Bayesian). The latter approach has the benefit of accounting for the uncertainty in the hyperparameters in estimating model parameter uncertainties.
Non-linear and linearized inversion results are compared for synthetic test cases and for the measured COPROD1 MT data by considering marginal probability distributions and marginal profiles. In some cases, important differences are indicated, including poorer sensitivity to thin and/or low-conductivity layers for linearized inversion, and multimodal PPDs which cannot be addressed within a linearized approach.
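To illustrate the idea of Metropolis-Hastings sampling in a rotated principal-component parameter space, a minimal sketch is given here. It is not the implementation used in this paper. The target is a toy correlated Gaussian standing in for an MT PPD, and the names `log_ppd` and `pc_metropolis` are hypothetical. The model covariance used for the rotation is assumed to be known; in practice it would have to be estimated. The `temperature` argument only indicates where a non-unity sampling temperature would enter the acceptance rule: at a temperature other than 1 the chain samples a flattened density, and the correction that this requires is not shown.

```python
import numpy as np


def log_ppd(m, mean, cov_inv):
    """Toy log-PPD: a correlated Gaussian (assumption, not an MT forward model)."""
    r = m - mean
    return -0.5 * r @ cov_inv @ r


def pc_metropolis(log_p, m0, cov_est, n_steps, temperature=1.0, seed=0):
    """Metropolis sampling with perturbations along principal components.

    cov_est is an estimate of the model covariance (assumed available).
    Its eigenvectors define the rotated axes, and the square roots of its
    eigenvalues set the proposal step size along each axis.
    """
    rng = np.random.default_rng(seed)
    w, U = np.linalg.eigh(cov_est)        # eigenvalues w, eigenvectors in columns of U
    step = np.sqrt(np.maximum(w, 0.0))    # proposal standard deviation per component
    m = np.asarray(m0, dtype=float)
    lp = log_p(m)
    samples = np.empty((n_steps, m.size))
    accepted = 0
    for i in range(n_steps):
        k = rng.integers(m.size)          # perturb one principal component at a time
        m_new = m + U[:, k] * step[k] * rng.standard_normal()
        lp_new = log_p(m_new)
        # Symmetric proposal, so the acceptance test needs only the PPD ratio,
        # here raised to the power 1/temperature.
        if np.log(rng.random()) < (lp_new - lp) / temperature:
            m, lp = m_new, lp_new
            accepted += 1
        samples[i] = m
    return samples, accepted / n_steps


# Toy usage: two strongly correlated parameters.
true_mean = np.zeros(2)
true_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
cov_inv = np.linalg.inv(true_cov)
samples, rate = pc_metropolis(
    lambda m: log_ppd(m, true_mean, cov_inv),
    m0=[0.0, 0.0], cov_est=true_cov, n_steps=20000,
)
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples.T)
```

Because each perturbation is aligned with an eigenvector of the covariance, the step length can match the extent of the distribution in that direction. Perturbing the original parameters one at a time would instead require small steps when the parameters are strongly correlated.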