Cross-silo federated learning (FL) is a distributed learning approach in which clients with a shared interest cooperatively train a global model while keeping their local data private. A successful cross-silo FL process requires the active participation of many clients. Unlike in cross-device FL, clients in cross-silo FL are usually organizations or companies that may run multiple cross-silo FL processes repeatedly because their local data sets vary over time, and that aim to optimize their long-term benefits by selfishly choosing their participation levels. While there has been some work on incentivizing clients to join FL, the analysis of clients' long-term selfish participation behaviors in cross-silo FL remains largely unexplored. In this paper, we analyze the selfish participation behaviors of heterogeneous clients in cross-silo FL. Specifically, we model clients' long-term selfish participation behaviors as an infinitely repeated game, with the stage game being a selfish participation game in one cross-silo FL process (SPFL). For the stage game SPFL, we derive the unique Nash equilibrium (NE) and propose a distributed algorithm for each client to calculate its equilibrium participation strategy. We show that at the NE, clients fall into at most three categories: (i) free riders who do not perform local model training, (ii) a unique partial contributor (if one exists) who performs model training with part of its local data, and (iii) contributors who perform model training with all their local data. The existence of free riders is detrimental both to achieving a good global model and to sustaining other clients' long-term participation. For the long-term interactions among clients, we derive a cooperative strategy that minimizes the number of free riders while increasing the amount of local data used for model training.
We show that, when enforced by a punishment strategy, this cooperative strategy is a subgame perfect Nash equilibrium (SPNE) of the infinitely repeated game, under which some clients who are free riders at the NE of the stage game choose to be (partial) contributors. We further propose an algorithm to calculate the optimal SPNE, which minimizes the number of free riders while maximizing the amount of local data used for model training. Simulation results show that our proposed cooperative strategy at the optimal SPNE can reduce the number of free riders by up to 98.8% and increase the amount of local data used for model training by up to 96%.
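To illustrate how a punishment strategy can sustain cooperation in an infinitely repeated game, the following minimal sketch checks the standard one-shot deviation condition. It assumes a grim-trigger punishment (permanent reversion to the stage-game NE after any deviation) and a common discount factor `delta`; the punishment strategy and utility model used in this paper may differ. The function names and the per-stage payoffs `u_coop`, `u_dev`, and `u_ne` are hypothetical placeholders, not quantities taken from the paper.

```python
def cooperation_sustainable(u_coop: float, u_dev: float, u_ne: float, delta: float) -> bool:
    """One-shot deviation check for one client under an assumed grim-trigger punishment.

    u_coop: per-stage payoff when every client follows the cooperative strategy
    u_dev:  best one-stage payoff the client can get by deviating
    u_ne:   per-stage payoff at the stage-game NE (the punishment phase)
    delta:  discount factor in [0, 1)
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    # Normalized discounted payoffs: cooperating forever yields u_coop, while
    # deviating yields u_dev once and then u_ne in every later stage.
    return u_coop >= (1.0 - delta) * u_dev + delta * u_ne


def critical_discount(u_coop: float, u_dev: float, u_ne: float) -> float:
    """Smallest discount factor at which the client has no incentive to deviate."""
    if u_dev <= u_coop:
        return 0.0  # deviating is never profitable, even for a myopic client
    if u_coop <= u_ne:
        return 1.0  # punishment is no worse than cooperating: unsustainable for delta < 1
    return (u_dev - u_coop) / (u_dev - u_ne)


if __name__ == "__main__":
    # Hypothetical payoffs for a client that free rides at the stage-game NE.
    print(critical_discount(u_coop=3.0, u_dev=5.0, u_ne=1.0))
    print(cooperation_sustainable(3.0, 5.0, 1.0, delta=0.6))
```

Under these assumptions, a sufficiently patient client (one whose discount factor is at least the critical value) prefers to keep contributing rather than free ride once and be punished in all later stages. For a strategy profile to be an equilibrium, the check must hold for every client.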