Many animal groups are heterogeneous and may even consist of individuals of different species; such groups are called mixed-species flocks. Mathematical and computational models of collective animal movement, however, typically assume that groups and populations consist of identical individuals. In this paper, using the mathematical framework of the coagulation-fragmentation process, we develop and analyze a model of merge-and-split group dynamics, also called fission-fusion dynamics, for heterogeneous populations that contain two types (or species) of individuals. We assume that more heterogeneous groups split at higher rates than homogeneous groups, forming two daughter groups whose compositions are drawn uniformly from all possible partitions. We analytically derive a master equation for group size and composition and find mean-field steady-state solutions. We predict that there is a critical group size below which groups are more likely to be homogeneous and to contain the abundant type/species. Above this critical group size, we find that groups are more likely to be heterogeneous, despite the propensity of heterogeneous groups to split at higher rates. Monte Carlo simulations of the model show excellent agreement with these analytical results. Thus, our model makes a testable prediction: the composition of flocks is group-size dependent and does not merely reflect population-level heterogeneity. We discuss the implications of our results for empirical studies of flocking systems.
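The merge-and-split dynamics described above can be illustrated with a minimal Monte Carlo sketch. It is not the implementation used in this paper, and several ingredients are assumptions made only for illustration: the discrete-time update, the merge probability `p_merge`, the heterogeneity measure `2 * min(a, b) / (a + b)`, the linear split probability `s0 + s1 * heterogeneity`, and the reading of "uniform over all possible partitions" as a uniform draw over all daughter compositions `(i, j)` that leave both daughters non-empty. All function and parameter names are hypothetical.

```python
import random


def heterogeneity(a, b):
    """Assumed measure: 0 for a single-type group, 1 for an even mix."""
    return 2 * min(a, b) / (a + b)


def split_group(a, b, rng):
    """Split group (a, b) into two non-empty daughters.

    Assumption: the first daughter (i, j) is uniform over all pairs with
    0 <= i <= a and 0 <= j <= b, excluding (0, 0) and (a, b).
    """
    # Index k enumerates pairs as k = i * (b + 1) + j; the first index
    # (0, 0) and the last index (a, b) are skipped.
    k = rng.randint(1, (a + 1) * (b + 1) - 2)
    i, j = divmod(k, b + 1)
    return (i, j), (a - i, b - j)


def simulate(n_a, n_b, steps, p_merge=0.5, s0=0.2, s1=0.8, seed=0):
    """Run an illustrative discrete-time fission-fusion process.

    Groups are (a, b) tuples counting individuals of each type.
    All rates here are hypothetical placeholders.
    """
    rng = random.Random(seed)
    # Start from isolated individuals.
    groups = [(1, 0)] * n_a + [(0, 1)] * n_b
    for _ in range(steps):
        if rng.random() < p_merge:
            # Merge event: two distinct groups chosen at random coalesce.
            if len(groups) >= 2:
                x, y = rng.sample(range(len(groups)), 2)
                (a1, b1), (a2, b2) = groups[x], groups[y]
                for idx in sorted((x, y), reverse=True):
                    groups.pop(idx)
                groups.append((a1 + a2, b1 + b2))
        else:
            # Split event: a random group splits with a probability that
            # increases with its heterogeneity.
            idx = rng.randrange(len(groups))
            a, b = groups[idx]
            p_split = min(1.0, s0 + s1 * heterogeneity(a, b))
            if a + b >= 2 and rng.random() < p_split:
                groups.pop(idx)
                groups.extend(split_group(a, b, rng))
    return groups


if __name__ == "__main__":
    final = simulate(n_a=300, n_b=100, steps=20000, seed=1)
    mixed = sum(1 for a, b in final if a > 0 and b > 0)
    print(len(final), "groups,", mixed, "of them mixed")
```

Tallying the fraction of mixed groups as a function of group size `a + b` over many runs is one way to compare such a simulation with a predicted size dependence of composition, under the assumptions stated above.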