The growth and popularity of streaming music have changed the way people consume music: users can now listen to music online anytime and anywhere. Integrating various recommendation algorithms and strategies (user profiling, collaborative filtering, content-based filtering, etc.) makes it possible to capture users’ interests and preferences and to recommend content of interest to them. In digital music marketing, however, behavioral data are sparse, which leads to inadequate mining of users’ music preference features. To address this problem, a metric ranking learning recommendation algorithm with fused content representation is proposed. Relative partial order relations are constructed from observed and unobserved behavioral data so that the model can be fully trained, audio feature extraction submodels tailored to the recommendation task are built to further alleviate the data sparsity problem, and the preference relationships between users and songs are then mined through metric learning. Convolutional neural networks are used to extract the high-level semantic features of songs. These features are then arranged into a session sequence that follows the order in which the user listened to the songs, and an attention-based bidirectional recurrent neural network is built on this sequence to reduce the influence of noisy data and learn the strong dependencies between songs.
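To make the idea of relative partial order relations and metric learning concrete, the following is a minimal sketch, not the proposed model itself. It assumes that each partial order is a triplet (user, listened song, unlistened song), that preference is measured by squared Euclidean distance in a shared embedding space, and that training uses a margin-based hinge loss. The function names, the margin, the learning rate, and the toy data are all hypothetical, and the hand-written song vectors merely stand in for the content representations that the audio submodels would produce.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_triplets(observed, all_songs, rng, negatives_per_positive=1):
    """Build (user, positive, negative) triplets.

    Each observed (listened) song is paired with songs sampled from the
    user's unobserved set, encoding the relative order positive > negative.
    """
    triplets = []
    for user, listened in sorted(observed.items()):
        unobserved = sorted(all_songs - listened)
        k = min(negatives_per_positive, len(unobserved))
        for pos in sorted(listened):
            for neg in rng.sample(unobserved, k):
                triplets.append((user, pos, neg))
    return triplets


def hinge_ranking_loss(u, p, n, margin=1.0):
    """Zero once the positive is closer to the user than the negative by `margin`."""
    return max(0.0, margin + squared_distance(u, p) - squared_distance(u, n))


def sgd_step(u, p, n, margin=1.0, lr=0.1):
    """One gradient step on the hinge loss; returns updated copies of u, p, n."""
    if hinge_ranking_loss(u, p, n, margin) == 0.0:
        return u, p, n  # constraint already satisfied, no update
    # Gradients of margin + ||u - p||^2 - ||u - n||^2, all taken at the old values.
    new_u = [ui - lr * 2 * (ni - pi) for ui, pi, ni in zip(u, p, n)]
    new_p = [pi + lr * 2 * (ui - pi) for ui, pi in zip(u, p)]
    new_n = [ni - lr * 2 * (ui - ni) for ui, ni in zip(u, n)]
    return new_u, new_p, new_n


# Hypothetical toy interaction data: user -> set of listened songs.
observed = {"u1": {"s1", "s2"}, "u2": {"s3"}}
all_songs = {"s1", "s2", "s3", "s4"}
triplets = build_triplets(observed, all_songs, random.Random(0))

# Hypothetical 2-d embeddings for one triplet.
user_vec, pos_vec, neg_vec = [0.0, 0.0], [1.0, 0.0], [0.5, 0.0]
loss_before = hinge_ranking_loss(user_vec, pos_vec, neg_vec)
user_vec, pos_vec, neg_vec = sgd_step(user_vec, pos_vec, neg_vec)
loss_after = hinge_ranking_loss(user_vec, pos_vec, neg_vec)
```

In this sketch, sampling negatives from unobserved songs is what lets sparse interaction data yield many training constraints: every listened song produces at least one ordering relation, even for users with very few plays.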