Although single-instance learning methods have achieved excellent performance on many multi-instance problems, the reasons for this success still lack a rigorous theoretical explanation. In particular, the relation between the number of causal factors (also called causal instances) in a bag and model performance remains unclear. The goal of our study is to use the causal relationship between instances and bags to enhance the interpretability of multi-instance learning. First, we provide a lower bound on the number of instances required to determine the causal factors in a real multi-instance learning task. Then, we provide a lower bound on the single-instance learning loss function when test instances and training instances follow the same distribution, and we extend this conclusion to the case where the distribution changes. We thus demonstrate theoretically that the number of causal factors in a bag is an important parameter affecting model performance when single-instance learning methods are used to solve multi-instance learning problems. Finally, we validate our theoretical analysis experimentally on a specific classification task.