Youth soccer plays a central role in the development of Chinese soccer, which makes a scientific training system for youth players essential. In this paper, a multidimensional logistic regression algorithm is trained with Bootstrap resampling and its parameters are optimized with a genetic algorithm, yielding an improved logistic regression algorithm. Combining this improved algorithm with the HBase database system produces the HBase discrete regression algorithm model. Experiments were conducted on the running ability, performance prediction, and mental health disorder detection of Chinese youth soccer players, and clubs in five locations that improved their youth training systems using the HBase discrete regression algorithm were then evaluated. The results show that the athletes' maximum instantaneous explosive power approaches 80 m/s². In the performance prediction experiment, the error between the performance predicted by the HBase discrete regression algorithm and the true value is less than 0.5. Across the five clubs that improved their youth training systems with the algorithm, the overall average evaluation score of the training system exceeds 83 points. This study has the potential to support the development of youth soccer in China.
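To make the combination of Bootstrap resampling, logistic regression, and genetic parameter search more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a binary logistic regression fitted by batch gradient descent, an ensemble of models fitted on bootstrap resamples, and a very small genetic algorithm that tunes two hypothetical hyperparameters (learning rate `lr` and L2 penalty `l2`). All function names, parameter ranges, and the synthetic data are assumptions for illustration, and the HBase storage layer is omitted.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr, l2, epochs=100):
    """Binary logistic regression by batch gradient descent (assumed learner)."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w[0] -= lr * grad[0] / n
        for j in range(1, d + 1):
            w[j] -= lr * (grad[j] / n + l2 * w[j])
    return w

def bootstrap_fit(X, y, lr, l2, n_models, rng):
    """Fit one model per bootstrap resample (rows drawn with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx], lr, l2))
    return models

def ensemble_proba(models, x):
    # Average the predicted probabilities of all bootstrap models.
    return sum(predict_proba(w, x) for w in models) / len(models)

def accuracy(models, X, y):
    hits = sum((ensemble_proba(models, xi) >= 0.5) == (yi == 1) for xi, yi in zip(X, y))
    return hits / len(X)

def genetic_search(X_tr, y_tr, X_val, y_val, rng, pop_size=6, generations=3):
    """Tiny GA over (lr, l2): keep the best half, refill by crossover and mutation."""
    def fitness(genome):
        # Fitness is validation accuracy; it is noisy because resampling is random.
        models = bootstrap_fit(X_tr, y_tr, genome[0], genome[1], n_models=3, rng=rng)
        return accuracy(models, X_val, y_val)

    pop = [(rng.uniform(0.01, 1.0), rng.uniform(0.0, 0.1)) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: lr from parent a, l2 from parent b; then mutate and clip.
            lr = min(1.0, max(0.01, a[0] * rng.uniform(0.8, 1.25)))
            l2 = min(0.1, max(0.0, b[1] + rng.uniform(-0.01, 0.01)))
            children.append((lr, l2))
        pop = parents + children
    return max(pop, key=fitness)

# Synthetic stand-in data (not from the paper): two features, linear boundary.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
y = [1 if x[0] + x[1] > 0 else 0 for x in X]
X_tr, y_tr, X_val, y_val = X[:60], y[:60], X[60:], y[60:]

best_lr, best_l2 = genetic_search(X_tr, y_tr, X_val, y_val, rng)
models = bootstrap_fit(X_tr, y_tr, best_lr, best_l2, n_models=10, rng=rng)
val_acc = accuracy(models, X_val, y_val)
print(f"lr={best_lr:.3f} l2={best_l2:.3f} validation accuracy={val_acc:.2f}")
```

Averaging over bootstrap models reduces the variance of any single fit, while the genetic search only explores hyperparameters; a multiclass ("multidimensional") variant would replace the sigmoid with a softmax over class scores.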