Malicious websites grow in number every year and have caused losses for companies and individuals worldwide, so detecting them remains a task that requires continuous development. This study proposes a joint neural network model that combines an attention mechanism, a bidirectional independently recurrent neural network (Bi-IndRNN), and a capsule network (CapsNet). The word-embedding tool word2vec is used to train static character-level and word-level embedding vectors for uniform resource locators (URLs). In parallel, texture fingerprint features are extracted to capture content differences among the binary files of malicious web URLs. The extracted features are then fused and fed into the joint model. First, multihead attention and Bi-IndRNN extract contextual semantic features, with the attention mechanism adjusting the feature weights. Second, CapsNet with dynamic routing extracts deep semantic information. Finally, a sigmoid classifier performs the classification. Extracting features with different methods and from different angles yields a more comprehensive feature set. Experimental results show that the proposed method achieves higher classification accuracy for malicious web page detection than the methods of other researchers.
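The three input views named in the abstract (character-level tokens, word-level tokens, and a byte-based texture representation) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the delimiter-based word splitting rule, the row width of 8, and the example URL are all assumptions made for illustration, and the byte matrix is built from the URL string itself rather than from the binary files the study refers to. The word2vec training step and the neural network are omitted.

```python
import re
from typing import List


def char_tokens(url: str) -> List[str]:
    """Character-level view: every character of the URL is one token."""
    return list(url)


def word_tokens(url: str) -> List[str]:
    """Word-level view: split on non-alphanumeric delimiters.

    The splitting rule is an assumption; the abstract does not specify one.
    """
    return [tok for tok in re.split(r"[^A-Za-z0-9]+", url) if tok]


def byte_matrix(data: bytes, width: int = 8) -> List[List[int]]:
    """Lay raw bytes out as rows of grayscale values (0-255).

    The last row is zero-padded to the chosen width. Treating bytes as
    pixels is an assumed reading of "texture fingerprint" features.
    """
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Hypothetical example URL, used only to show the three views.
url = "http://example.com/login.php?id=42"
chars = char_tokens(url)
words = word_tokens(url)
image = byte_matrix(url.encode("utf-8"), width=8)
```

In a pipeline of the kind the abstract describes, the token sequences would be passed to an embedding model and the byte matrix to a texture descriptor before the features are fused; those stages depend on details the abstract does not give.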