The purpose of this study was to compare the sensitivity of three methods, IRT-Likelihood Ratio (IRT-LR), Mantel-Haenszel (MH), and Logistic Regression (LR), in detecting gender differential item functioning (DIF) on the National Mathematics Examination (Ujian Nasional: UN) for the 2014/2015 academic year in North Sumatera Province, Indonesia. An item that exhibits DIF is unfair: it advantages test takers from one group and disadvantages test takers from another group even when they have the same ability. DIF was examined by gender, with men as the reference group (R) and women as the focal group (F). The study used an experimental method with a 3x1 design: one factor (method) with three treatments, namely the three DIF detection methods. The 2015 UN Mathematics test had five packages (codes 1107, 2207, 3307, 4407, and 5507). Package 2207 was taken as the sample data, comprising 5000 participants (3067 women and 1933 men) and 40 UN items. Item selection based on classical test theory (CTT) retained 32 of the 40 items, and selection based on item response theory (IRT) retained 18 items. Using R 3.333 and IRTLRDIF 2.0, 5 items were detected as DIF by the IRT-LR method, 4 items by the LR method, and 3 items by the MH method. Because a single round of DIF detection is not enough to test the sensitivity of the three methods, six analysis groups were formed by sample size and number of items: (4400,40), (4400,32), (4400,18), (3000,40), (3000,32), and (3000,18). In each group, 40 random data sets were generated (without repetition), and DIF detection was carried out on the items in each data set. Although the data did not fit the model well, the three-parameter logistic (3PL) model was chosen as the most suitable model.
Tukey's HSD post hoc test showed that the IRT-LR method was more sensitive than the MH and LR methods in groups (4400,40) and (3000,40). In groups (4400,32) and (3000,32), the IRT-LR method was no longer more sensitive than LR but was still more sensitive than MH. In groups (4400,18) and (3000,18), the IRT-LR method was more sensitive than LR but not significantly more sensitive than MH. The LR method was consistently more sensitive than the MH method across all analysis groups.
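For readers unfamiliar with the MH procedure, the following is a minimal Python sketch of the Mantel-Haenszel common odds ratio for a single item. It is an illustration only and is not the code used in this study, which was run in R and IRTLRDIF. The function name `mantel_haenszel_dif`, the `"R"`/`"F"` group labels, and the use of the total test score as the matching variable are assumptions made for this example; the transformation to the delta scale follows the commonly used ETS convention of multiplying the log odds ratio by -2.35.

```python
import math


def mantel_haenszel_dif(responses, groups, item):
    """Mantel-Haenszel common odds ratio for one item (illustrative sketch).

    responses: list of 0/1 lists, one per examinee (hypothetical layout)
    groups:    list of "R" (reference) or "F" (focal), one per examinee
    item:      column index of the studied item
    Returns (alpha_mh, delta_mh), or None if the ratio cannot be estimated.
    """
    # Stratify examinees by total score (assumed matching variable).
    # Each stratum stores [correct, incorrect] counts per group.
    strata = {}
    for resp, group in zip(responses, groups):
        cell = strata.setdefault(sum(resp), {"R": [0, 0], "F": [0, 0]})
        cell[group][0 if resp[item] == 1 else 1] += 1

    numerator = 0.0
    denominator = 0.0
    for cell in strata.values():
        a, b = cell["R"]  # reference: correct, incorrect
        c, d = cell["F"]  # focal: correct, incorrect
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    if numerator == 0 or denominator == 0:
        return None  # odds ratio undefined for this data

    alpha_mh = numerator / denominator
    # ETS delta scale: negative values indicate the item favors the reference group.
    delta_mh = -2.35 * math.log(alpha_mh)
    return alpha_mh, delta_mh
```

An odds ratio near 1 (delta near 0) suggests no DIF for the item, while values far from 1 suggest that examinees of equal total score in the two groups differ in their chance of answering correctly. A full analysis would also compute the MH chi-square statistic and apply a significance criterion, which this sketch omits.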