t-Distributed Stochastic Neighbor Embedding (t-SNE) and k-means are increasingly used for dimension reduction and graphical illustration in medical imaging (e.g., CT) informatics. Mapping a grid network onto a slide is a prerequisite for cluster analysis. Traditionally, the performance of cluster analysis is driven by hyperparameters; however, grid size, which also affects performance, is often set arbitrarily. In this study, we evaluated the effect of varying grid size together with the perplexity and learning rate hyperparameters on unsupervised clustering of CT images of renal masses. The number of clusters was determined by the gap statistic. The grid sizes examined were 2x2, 4x4, 5x5, and 8x8. The number of output clusters increased as grid size decreased from 8x8 to 4x4. However, at a grid size of 2x2, the model yielded the same number of clusters as at 8x8. This finding was consistent across hyperparameter settings. Because both grid sizes yielded the same number of clusters, we conducted additional analyses of the nesting structure between the cluster memberships (the mutually exclusive cluster number assigned to each grid cell in a cluster analysis) obtained from the large (8x8) and small (2x2) grids. The cluster memberships from the large and small grids overlapped only partially, which suggests that the small grid detects additional patterns or information. In conclusion, grid size should be treated as another hyperparameter when using unsupervised clustering methods for pattern recognition in medical imaging analysis.
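The nesting comparison between large-grid and small-grid cluster memberships can be illustrated with a minimal sketch. The abstract does not specify how the nesting analysis was computed, so everything here is an assumption made for illustration: grid size is read as the side length in pixels of each square cell, the helper names (`tile`, `nesting_table`, `purity`, `threshold_label`) are hypothetical, and a simple mean-intensity threshold stands in for the t-SNE plus k-means step.

```python
from collections import Counter


def tile(image, g):
    """Split a 2-D image (list of rows) into non-overlapping g x g cells.

    Returns {(cell_row, cell_col): flat list of pixel values}.
    Pixels that do not fill a complete cell at the edges are dropped.
    """
    h, w = len(image), len(image[0])
    cells = {}
    for r in range(h // g):
        for c in range(w // g):
            cells[(r, c)] = [image[r * g + i][c * g + j]
                             for i in range(g) for j in range(g)]
    return cells


def threshold_label(cells, cutoff=0.25):
    """Stand-in for the clustering step (NOT t-SNE/k-means): label by mean."""
    return {key: int(sum(px) / len(px) > cutoff) for key, px in cells.items()}


def nesting_table(small_labels, large_labels, small_g, large_g):
    """Cross-tabulate each small cell's label against its parent large cell's label.

    Assumes large_g is an integer multiple of small_g.
    """
    ratio = large_g // small_g
    table = Counter()
    for (r, c), small in small_labels.items():
        parent = (r // ratio, c // ratio)
        if parent in large_labels:
            table[(large_labels[parent], small)] += 1
    return table


def purity(table):
    """Share of small cells lying in the dominant large-grid cluster
    of their own small-grid cluster (1.0 means perfect nesting)."""
    best = {}
    for (_, small), n in table.items():
        best[small] = max(best.get(small, 0), n)
    return sum(best.values()) / sum(table.values())


# Toy 16 x 16 image: the rightmost four columns are bright.
image = [[1 if x >= 12 else 0 for x in range(16)] for y in range(16)]

large = threshold_label(tile(image, 8))   # 4 cells
small = threshold_label(tile(image, 2))   # 64 cells
table = nesting_table(small, large, small_g=2, large_g=8)
score = purity(table)
print(dict(table), score)
```

In this toy case both grids produce two labels, yet some small cells inside a "bright" large cell are labeled dark, so the memberships overlap only partially. A real analysis would replace `threshold_label` with the actual clustering output.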