For a given test criterion, the number of test-sets satisfying the criterion may be very large, and these test-sets may vary in fault detection effectiveness. In recent work [29], the measure of variation in the effectiveness of a test criterion was defined as 'tolerance'. This paper presents an experimental evaluation of tolerance for control-flow test criteria. The experimental analysis is done by exhaustive test-set generation, wherever possible, for a given criterion, which improves on earlier empirical studies that analysed only some test-sets chosen by random selection techniques. Four industrially used control-flow testing criteria, Condition Coverage (CC), Decision Condition Coverage (DCC), Full Predicate Coverage (FPC) and Modified Condition/Decision Coverage (MC/DC), have been analysed against four types of faults. A new test criterion, Reinforced Condition/Decision Coverage (RC/DC) [28], is also analysed and compared. The Boolean specifications considered were taken from a past research paper and also generated randomly. To ensure that it is the test-set property that influences the effectiveness and not the test-set size, the average test-set size was kept the same for all the test criteria except RC/DC. A further analysis of the variation in average effectiveness with respect to the number of conditions in the decision was also undertaken. The empirical results show that the MC/DC criterion is more reliable and stable than the other considered criteria. Although the number of test-cases in RC/DC test-sets is large, no significant improvement in effectiveness or tolerance was observed over MC/DC.