Background Although propensity score methods have been widely adopted in observational studies, few studies have focused on how to properly estimate and incorporate propensity scores in complex survey data settings. In particular, research is lacking on propensity score-based weighting methods for estimating treatment effects from complex survey data with binary outcomes. The objective of this study is therefore to compare three propensity score weighting methods for complex survey data when outcomes are binary.
Methods A simulation study was used to compare three propensity score weighting approaches for estimating treatment effects using survey data: 1) no survey weights in the propensity score model or the outcome model; 2) survey weights in the outcome model only; 3) survey weights in both models. Each of the three methods was applied to four measures of the treatment effect: the sample average treatment effect (SATE), the population average treatment effect (PATE), the sample average treatment effect on the treated (SATT), and the population average treatment effect on the treated (PATT). The methods were compared in terms of mean relative bias and coverage probability under scenarios that varied the sample size, the treatment effect, the degree of model misspecification, and the level of overlap in the propensity score. In addition, using the 2015 National Health Interview Survey (NHIS) data as a real data example, each method was used to estimate the effect of provider-patient discussion about smoking on smoking cessation.
Results The methods that account for the survey weight outperform the unweighted method as the degree of misspecification increases, regardless of the sample size, treatment effect, and level of overlap in the propensity score. The two weighted methods, in which the survey weight is incorporated in the outcome model only or in both the propensity score model and the outcome model, perform similarly.
Conclusions Propensity score weighting methods that account for the survey weight are necessary for estimating population-level treatment effects. This paper provides guidance for selecting an appropriate propensity score weighting method based on the estimand of interest when using complex survey data with binary outcomes.
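Illustration The three weighting approaches compared in this study can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the propensity score model is a logistic regression, that the outcome model reduces to a weighted difference in outcome proportions (a risk difference), and that the final analysis weight is the product of the survey weight and the propensity score weight. All function names, the synthetic data, and its coefficients are hypothetical.

```python
import numpy as np

def fit_logistic(X, t, w=None, n_iter=50, tol=1e-10):
    """Weighted logistic regression by Newton-Raphson; returns coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])  # add intercept column
    w = np.ones(len(t)) if w is None else np.asarray(w, dtype=float)
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))
        grad = X1.T @ (w * (t - p))
        hess = (X1 * (w * p * (1.0 - p))[:, None]).T @ X1
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def propensity(X, t, w=None):
    """Estimated probability of treatment, optionally survey-weighted."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-X1 @ fit_logistic(X, t, w)))

def ate_weights(e, t):
    """Inverse probability of treatment weights (average treatment effect)."""
    return t / e + (1 - t) / (1 - e)

def att_weights(e, t):
    """Weights targeting the treated: 1 for treated, odds e/(1-e) for controls."""
    return t + (1 - t) * e / (1 - e)

def weighted_risk_difference(y, t, w):
    """Difference in weighted outcome proportions, treated minus control."""
    treated = np.sum(w * t * y) / np.sum(w * t)
    control = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    return treated - control

def three_methods(X, t, y, sw, ps_weights=ate_weights):
    """Risk difference under the three survey-weight strategies."""
    e_unweighted = propensity(X, t)      # propensity model ignores survey weights
    e_weighted = propensity(X, t, sw)    # propensity model uses survey weights
    return {
        # 1) no survey weights in either model
        "method1_no_survey_weights":
            weighted_risk_difference(y, t, ps_weights(e_unweighted, t)),
        # 2) survey weights in the outcome step only (assumed product of weights)
        "method2_outcome_only":
            weighted_risk_difference(y, t, sw * ps_weights(e_unweighted, t)),
        # 3) survey weights in both the propensity and outcome steps
        "method3_both_models":
            weighted_risk_difference(y, t, sw * ps_weights(e_weighted, t)),
    }

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
sw = 1.0 + np.exp(0.5 * x[:, 0])  # made-up survey weights tied to a covariate
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.3 * x[:, 0] - 0.2 * x[:, 1]))))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 0.4 * t + 0.3 * x[:, 0]))))

results = three_methods(x, t, y, sw)
for name, estimate in results.items():
    print(f"{name}: {estimate:.3f}")
```

Passing `ps_weights=att_weights` to `three_methods` gives the corresponding estimates for the treated. In this sketch, the unweighted estimate is the natural analogue of a sample-level estimand and the survey-weighted estimates of a population-level estimand; variance estimation and coverage, which require the survey design information, are not shown.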