The log-rank test is most powerful under proportional hazards (PH). In practice, non-PH patterns are often observed in clinical trials, for example in immuno-oncology; alternative methods are therefore needed to restore the efficiency of statistical testing. Three categories of testing methods were evaluated: weighted log-rank tests, Kaplan-Meier curve-based tests (weighted Kaplan-Meier and Restricted Mean Survival Time, RMST), and combination tests (Breslow test, Lee's combo test, and MaxCombo test). Nine scenarios representing PH and various non-PH patterns were simulated, and the power, type I error, and effect estimates of each method were compared. In general, all tests control the type I error well. No single test is the most powerful across all scenarios. In the absence of prior knowledge about the PH or non-PH pattern, the MaxCombo test is relatively robust across patterns. Because the treatment effect changes over time under non-PH, a single measure may not represent the overall profile of the treatment effect comprehensively. Multiple measures of the treatment effect should therefore be pre-specified as sensitivity analyses to evaluate the totality of the data.
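As an illustration of one of the Kaplan-Meier curve-based measures named above, the following is a minimal sketch of RMST computed as the area under a Kaplan-Meier step curve up to a truncation time. The function names `kaplan_meier` and `rmst`, the parameter `tau`, and the toy data are assumptions made for this example; they are not taken from the study, and the sketch does not reproduce its simulation or any of its tests.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time, in increasing order.

    times  : follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last rectangle, from the final step before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Toy data (hypothetical): one arm with a single censored observation.
arm_times = [1, 2, 3]
arm_events = [1, 0, 1]
arm_rmst = rmst(arm_times, arm_events, tau=3)
```

In a two-arm comparison, the difference between the two arms' RMST values at a common `tau` would be one candidate summary of the treatment effect; the choice of `tau` is left open here.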
It is often necessary to compare two measurement methods in medicine and other experimental sciences, and this problem covers a broad range of data. Many authors have explored ways of assessing the agreement of two sets of measurements. However, relatively little attention has been paid to the problem of determining the sample size when designing an agreement study. In this paper, a method using the interval approach for concordance is proposed to calculate the sample size for an agreement study. The philosophy behind it is that concordance is satisfied when no more than a pre-specified number k of discordances are found in a reasonably large sample of size n, since a discordant pair is much easier to define. The goal is to find such a reasonably large sample size n. The sample size calculation is based on two rates, the discordance rate and the tolerance probability, which in turn can be used to quantify an agreement study. The proposed approach is demonstrated on a real data set.
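One way to read the calculation described above is as a binomial search for n. The sketch below is an illustration under that assumption, not necessarily the paper's exact formula: it assumes that pairs are independent, that each pair is discordant with probability equal to the discordance rate, and that n should be the smallest size at which observing k or fewer discordances has probability at most one minus the tolerance probability. The function and parameter names are hypothetical.

```python
from math import comb


def prob_at_most_k(n, k, rate):
    """Binomial probability of k or fewer discordant pairs among n pairs."""
    return sum(comb(n, j) * rate**j * (1 - rate) ** (n - j) for j in range(k + 1))


def agreement_sample_size(k, discordance_rate, tolerance_prob, n_max=100_000):
    """Smallest n with P(X <= k | n, discordance_rate) <= 1 - tolerance_prob.

    Assumption for this sketch: if the true discordance rate were as high as
    `discordance_rate`, seeing at most k discordances in n pairs should be
    unlikely (probability no larger than 1 - tolerance_prob).
    """
    if not (0 < discordance_rate < 1 and 0 < tolerance_prob < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    for n in range(k + 1, n_max + 1):
        if prob_at_most_k(n, k, discordance_rate) <= 1 - tolerance_prob:
            return n
    raise ValueError("no sample size found up to n_max")


# Example: allow no discordances, discordance rate 0.05, tolerance probability 0.95.
n_zero = agreement_sample_size(k=0, discordance_rate=0.05, tolerance_prob=0.95)
print(n_zero)  # 59
```

For k = 0 the condition reduces to (1 - rate)^n <= 1 - tolerance probability, so n is the ceiling of log(1 - tolerance probability) / log(1 - rate), which gives 59 for the values in the example. Allowing more discordances (larger k) increases the required n under this reading.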