Background
The Appraisal of Guidelines for Research and Evaluation (AGREE) II instrument has been widely used by researchers worldwide to assess the methodological quality of clinical practice guidelines (CPGs). We sought to identify the items or domains that commonly receive low scores in these assessments, and to systematically review the issues that emerged when evaluators used the AGREE II tool for guideline quality assessment.

Methods
A systematic search was conducted to identify articles published in medically relevant databases from 2022 to 2023 that used the AGREE II tool to assess CPGs. We extracted the scores for the six quality domains and the overall quality assessment data for the CPGs included in those articles, and analyzed the data using descriptive statistics, difference analysis, regression analysis, and correlation analysis. A seven-point Likert scale was used to assess the reporting quality of the included articles.

Results
We identified 151 relevant publications, covering 2081 guidelines published between 1990 and 2022. Regression analysis showed that every domain had a statistically significant effect on overall guideline quality (p < 0.001; R² = 0.777). Scores for Domains 1, 2, 3, 4, and 6 differed significantly over time (p < 0.001) and showed an upward trend. Scores were good for Domain 4 (median 78.00 [IQR: 62.75–89.00]; mean 74.34 [SD 18.85]) and Domain 1 (median 78.00 [IQR: 61.00–90.00]; mean 73.57 [SD 21.12]). Scores were moderate for Domain 6 (median 58.33 [IQR: 25.00–83.33]; mean 53.98 [SD 34.13]), Domain 2 (median 53.00 [IQR: 33.30–72.10]; mean 53.30 [SD 24.52]), and Domain 3 (median 51.00 [IQR: 26.02–73.00]; mean 50.44 [SD 27.19]). Scores were poor for Domain 5 (median 36.20 [IQR: 20.20–58.32]; mean 40.21 [SD 24.90]).
In addition, the reporting quality of the included articles was rated low for 33.1% and very low for 11.9%.

Conclusions
The AGREE II tool has helped improve the methodological quality of CPGs. Although the quality of CPGs has improved over time, some common low-quality problems persist, and addressing them would be an effective way for developers to raise the quality of guidelines. In addition, resolving critical issues in how guidelines are evaluated, so that appraisal studies are reported to a high standard, would be another way to guide guideline development.
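
Note on domain scores
The domain scores summarized in the Results are percentages on the 0 to 100 scale that AGREE II uses. Each appraiser rates every item in a domain from 1 to 7, and the domain score is the obtained total rescaled between the minimum and maximum possible totals. The Python sketch below illustrates that scaling only; it is not the analysis code used in this review, and the function name and sample ratings are hypothetical.

```python
from typing import Sequence


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return the AGREE II scaled score (0-100) for one domain.

    ratings[a][i] is appraiser a's rating (1-7) for item i of the domain.
    The function name and input layout are illustrative assumptions.
    """
    if not ratings or not ratings[0]:
        raise ValueError("need at least one appraiser and one item")

    n_appraisers = len(ratings)
    n_items = len(ratings[0])

    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every appraiser must rate every item")
        if any(not 1 <= r <= 7 for r in row):
            raise ValueError("AGREE II item ratings range from 1 to 7")

    obtained = sum(sum(row) for row in ratings)
    min_possible = 1 * n_items * n_appraisers  # every item rated 1
    max_possible = 7 * n_items * n_appraisers  # every item rated 7

    # Rescale the obtained total to a percentage of the possible range.
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical ratings: four appraisers, one three-item domain.
example = [
    [5, 6, 6],
    [6, 6, 7],
    [2, 4, 3],
    [3, 3, 2],
]
print(round(scaled_domain_score(example), 1))
```

With these sample ratings the obtained total is 53, the minimum possible is 12, and the maximum possible is 84, so the scaled score is (53 − 12) / (84 − 12) × 100, or about 57%. Because the score depends on the number of appraisers and items, domain scores from different appraisals are comparable only after this rescaling.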