Malaria is one of the major global health threats. Microscopic examination is the “gold standard” for malaria diagnosis designated by the World Health Organization. However, microscopy relies heavily on doctors’ expertise, which leads to prolonged diagnosis times, reduced efficiency, and a risk of misdiagnosis. This paper introduces a multi-level attention split network (MAS-Net) that addresses two challenges: information loss when detecting small targets, and the mismatch between the detection head’s receptive field and the target size. We introduce a split contextual attention structure (SPCot) that incorporates contextual information, and we build a novel feature extraction network around it. SPCot avoids excessive compression of feature map channels, makes full use of contextual information, reduces information loss, and improves detection performance. Furthermore, we add a multi-scale receptive field detection head (MRFH) to the shallow detection layer. This module adapts to targets of varying scales, enhances the receptive field, and improves the detection performance for malaria cells. The model achieves an average accuracy of 75.9% on a publicly available malaria dataset.