With the rapid development of the information age, social groups and institutions produce large amounts of data every day. Traditional semantic similarity algorithms emerged to store, identify, and manage such data more efficiently and reasonably. However, the accuracy of these algorithms is relatively low and their convergence is poor. To address this problem, this paper starts from the conceptual structure of language and analyzes it at two levels: the depth of the language structure and the distance between nodes. For information of a specific resource description framework type, the weights of the interconnecting edges are used for impact analysis, so that the semantic similarity impact analysis covers all of the data. Building on these improvements, this paper also systematically establishes a data modeling process based on the conceptual structure of language and builds the corresponding model. In the experimental part, the improved algorithm is simulated and analyzed. The simulation results show that the improved algorithm is clearly more accurate than the traditional algorithm.
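As an illustration only, the following minimal sketch shows one way that concept depth, node distance, and edge weights can be combined into a similarity score over a concept hierarchy. It uses a Wu-Palmer-style measure with weighted depths as a stand-in, which is not necessarily the formulation developed in this paper. The toy taxonomy, the edge weights, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy taxonomy: child -> (parent, edge weight).
# The root ("entity") has no entry. Weights are illustrative assumptions.
TAXONOMY: Dict[str, Tuple[str, float]] = {
    "animal": ("entity", 1.0),
    "plant": ("entity", 1.0),
    "dog": ("animal", 0.5),
    "cat": ("animal", 0.5),
    "oak": ("plant", 0.5),
}


def path_to_root(node: str) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in TAXONOMY:
        path.append(TAXONOMY[path[-1]][0])
    return path


def weighted_depth(node: str) -> float:
    """Sum of edge weights from the root down to `node` (the root has depth 0)."""
    depth = 0.0
    while node in TAXONOMY:
        parent, weight = TAXONOMY[node]
        depth += weight
        node = parent
    return depth


def lowest_common_ancestor(a: str, b: str) -> Optional[str]:
    """Deepest node that lies on both paths to the root, or None if there is none."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    return None


def weighted_distance(a: str, b: str) -> float:
    """Weighted path length between two nodes through their common ancestor."""
    lca = lowest_common_ancestor(a, b)
    if lca is None:
        return float("inf")
    return weighted_depth(a) + weighted_depth(b) - 2.0 * weighted_depth(lca)


def similarity(a: str, b: str) -> float:
    """Wu-Palmer-style score in [0, 1] computed from weighted depths."""
    if a == b:
        return 1.0
    lca = lowest_common_ancestor(a, b)
    if lca is None:
        return 0.0
    total = weighted_depth(a) + weighted_depth(b)
    return 2.0 * weighted_depth(lca) / total if total > 0 else 0.0
```

In this sketch, two concepts that share a deep common ancestor (such as "dog" and "cat") score higher than two concepts that meet only at the root (such as "dog" and "oak"), and lowering the weight of an edge shortens the distance it contributes.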