Objective: The objective of this work is to build a prediction model for Operating Room Time (ORT) to be used in an intelligent scheduling system. Predicting ORT is difficult because of its high variability and the many variables that influence it. Materials and methods: We assessed a new strategy that uses Latent Class Analysis (LCA) and clustering methods to identify subgroups of procedures and surgeries, which are then combined with prediction models to improve ORT estimates. Three tree-based models are assessed, Classification and Regression Trees (CART), Conditional Random Forest (CFOREST) and Gradient Boosting Machine (GBM), under two scenarios: (i) a basic dataset of predictors and (ii) a complete dataset with procedures coded as binary variables. Model parameters are tuned on a training dataset and the models are evaluated on a separate test dataset. Results and discussion: The best results are obtained with the GBM model using the complete dataset and the grouping variables, with an operational accuracy of 57.3% in the test set. Conclusion: The results indicate that the GBM model outperforms the other models, and that it improves further when the procedures are included as binary variables and when the grouping variables are added. These grouping variables are obtained with LCA and hierarchical clustering, which identify homogeneous groups of procedures and surgeries.
In the inferential process of Principal Component Analysis (PCA), one of the main challenges for researchers is establishing the correct number of components to represent the sample. For that purpose, heuristic and statistical strategies have been proposed. One statistical approach consists of testing the hypothesis of equality of the smallest eigenvalues of the covariance or correlation matrix using a Likelihood-Ratio Test (LRT) whose statistic follows a χ² limiting distribution. Different correction factors have been proposed to improve the approximation of the sampling distribution of the statistic. We use simulation to study the significance level and power of the test under these different factors and to analyze the sample size required for an adequate approximation. The results indicate that, for the covariance matrix, the factor proposed by Bartlett offers the best balance between the objectives of a low probability of Type I error and high power.
If the correlation matrix is used, the factors W* and c_χ² are the most recommended. Empirically, we can observe that most factors require sample sizes 10 or 20 times the number of variables if covariance or correlation matrices, respectively, are implemented.
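As an illustration of the kind of test discussed above, the following is a minimal sketch of the textbook likelihood-ratio statistic for equality of the q = p - k smallest eigenvalues of a sample covariance matrix. It is not taken from the paper: the function name `equal_tail_eigenvalue_lrt` and its arguments are hypothetical, and the multiplier n - (2p + 11)/6 is one commonly cited Bartlett-type correction, assumed here because the abstract does not state the exact form of the factors it compares. The sketch covers the covariance case only and takes the eigenvalues as given.

```python
import math


def equal_tail_eigenvalue_lrt(eigenvalues, n_obs, k, bartlett=True):
    """Return (statistic, df) for H0: the p - k smallest eigenvalues are equal.

    eigenvalues: sample covariance eigenvalues (any order, all positive).
    n_obs: number of observations; n = n_obs - 1 is used as the base multiplier.
    k: number of leading components retained.
    bartlett: if True, use the assumed multiplier n - (2p + 11) / 6.
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    q = p - k
    if k < 0 or q < 2:
        raise ValueError("need at least two trailing eigenvalues to test")
    tail = lam[k:]
    if min(tail) <= 0:
        raise ValueError("eigenvalues must be strictly positive")

    # log(arithmetic mean) - log(geometric mean) of the trailing eigenvalues;
    # this is zero exactly when they are all equal.
    arith_mean = sum(tail) / q
    mean_log = sum(math.log(v) for v in tail) / q
    log_ratio = math.log(arith_mean) - mean_log

    n = n_obs - 1
    multiplier = n - (2 * p + 11) / 6 if bartlett else n  # assumed correction
    statistic = multiplier * q * log_ratio

    # Degrees of freedom of the limiting chi-square distribution.
    df = (q + 2) * (q - 1) // 2
    return statistic, df
```

The returned statistic would be compared with an upper quantile of the χ² distribution with `df` degrees of freedom. Because the assumed multiplier is smaller than n, the corrected statistic is always smaller than the uncorrected one, so the corrected test rejects less often in small samples.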