With massive concurrent user access in the era of big data, building high-performance web services has become a key challenge for network applications. This paper adopts the Node.js architecture and exploits its event-driven programming model and non-blocking I/O to improve CPU utilization on the web server side. It also proposes a cache replacement strategy that uses a Gaussian mixture model (GMM) to predict the resources users will access, further optimizing the performance of high-concurrency web services. In testing, the high-concurrency web framework built on Node.js performs well: multi-threaded Node.js outperforms single-threaded Node.js, and in high-concurrency scenarios the Node.js architecture outperforms the Apache architecture in average response time, request-response rate, and data throughput. The Gaussian mixture clustering model is effective on data with high-dimensional features; averaged over the four clustering processes, the accuracy of the algorithms is, in descending order, GMM (79.67%), BIRCH (78.09%), and K-means (77.52%). In addition, the GMM-based cache replacement strategy is more effective than the traditional cache replacement strategy: at a cache capacity of 214 KB it reaches an accuracy rate of nearly 80% and a byte hit rate of nearly 45%, both higher than those of the traditional strategy at the same capacity.
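To make the idea of prediction-driven cache replacement concrete, the following is a minimal JavaScript sketch, not the paper's implementation. It assumes a byte-bounded cache that, when full, evicts the entry with the lowest predicted access score. The class name `PredictiveCache` and the `predictScore` callback are hypothetical; `predictScore` stands in for a GMM-based predictor and is replaced here by a fixed lookup table so that the example stays self-contained.

```javascript
// Sketch only: a byte-bounded cache that evicts the entry with the lowest
// predicted access score. `predictScore` is a hypothetical stand-in for a
// GMM-based predictor; it is not taken from the paper.
class PredictiveCache {
  constructor(capacityBytes, predictScore) {
    this.capacityBytes = capacityBytes;
    this.predictScore = predictScore;
    this.entries = new Map(); // key -> size in bytes
    this.usedBytes = 0;
    this.requests = 0;
    this.hits = 0;
    this.requestedBytes = 0;
    this.hitBytes = 0;
  }

  // Record a request for `key` of `size` bytes; returns true on a cache hit.
  request(key, size) {
    this.requests += 1;
    this.requestedBytes += size;
    if (this.entries.has(key)) {
      this.hits += 1;
      this.hitBytes += size;
      return true;
    }
    this.insert(key, size);
    return false;
  }

  // Cache a missed object, evicting low-score entries until it fits.
  insert(key, size) {
    if (size > this.capacityBytes) return; // object can never fit
    while (this.usedBytes + size > this.capacityBytes) {
      this.evictLowestScore();
    }
    this.entries.set(key, size);
    this.usedBytes += size;
  }

  // Remove the cached entry that the predictor rates least likely to be reused.
  evictLowestScore() {
    let victim = null;
    let lowest = Infinity;
    for (const key of this.entries.keys()) {
      const score = this.predictScore(key);
      if (victim === null || score < lowest) {
        lowest = score;
        victim = key;
      }
    }
    this.usedBytes -= this.entries.get(victim);
    this.entries.delete(victim);
  }

  // Fraction of requests served from the cache.
  hitRate() {
    return this.requests === 0 ? 0 : this.hits / this.requests;
  }

  // Fraction of requested bytes served from the cache.
  byteHitRate() {
    return this.requestedBytes === 0 ? 0 : this.hitBytes / this.requestedBytes;
  }
}

// Illustrative scores (assumed values, not results from the paper).
const scores = { "/index.html": 0.9, "/logo.png": 0.6, "/old.css": 0.1 };
const cache = new PredictiveCache(100, (key) => scores[key] ?? 0);

cache.request("/index.html", 50); // miss, cached
cache.request("/old.css", 30);    // miss, cached
cache.request("/logo.png", 40);   // miss, evicts "/old.css" (lowest score)
cache.request("/index.html", 50); // hit

console.log(cache.hitRate(), cache.byteHitRate());
```

In this sketch the hit rate counts requests and the byte hit rate counts bytes, so the two can differ when object sizes vary. A traditional policy such as LRU could be compared by swapping `predictScore` for a recency-based score.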