The automatic detection of bone marrow (BM) cell diseases plays a vital role in the medical field: it makes diagnoses more precise and effective, enabling early detection that can significantly improve patient outcomes and increase the chances of successful intervention. This study proposes a fully automated intelligent system for BM classification built by developing and enhancing the Capsule Neural Network (CapsNet) architecture. Although CapsNet has demonstrated success in many classification tasks, it still inherits some limitations of Convolutional Neural Networks (CNNs), which suffer from information loss during pooling and discard detailed spatial information, resulting in the loss of fine‐grained features. Additionally, CNNs struggle to capture hierarchical feature relationships explicitly and often learn them only implicitly by stacking convolutional layers. In contrast, CapsNets are designed to capture hierarchical features through dynamic routing between capsules, yielding a more explicit representation of spatial hierarchy. CapsNets also handle transformations and offer equivariance, preserving spatial information through their capsule routing mechanisms. Further, to improve feature representation, pre‐trained models were used: Residual Capsule Network (RES‐CapsNet), Visual Geometry Group Capsule Network (VGG‐CapsNet), and Google Network (Inception V3) Capsule Network (GN‐CapsNet). This lets the network reuse previously learned low‐ and mid‐level features, so that subsequent capsule layers receive better initial information. Additionally, the Synthetic Minority Over‐Sampling Technique (SMOTE) was applied to mitigate class imbalance; it generates synthetic samples in feature space by over‐sampling the minority class, improving model performance in accurately classifying rare instances.
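The dynamic routing and squashing described above can be sketched in a few lines of NumPy. This is a minimal illustration of routing-by-agreement as introduced in the original CapsNet literature, not the exact configuration used in this study; the capsule counts, dimensions, and iteration count below are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity: rescales a vector's length into [0, 1)
    while preserving its direction, so length can encode probability."""
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(u_hat, n_iter=3):
    """Routing-by-agreement between two capsule layers.
    u_hat: (n_in, n_out, dim) prediction vectors from lower capsules."""
    n_in, n_out, dim = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits, start uniform
    for _ in range(n_iter):
        # coupling coefficients: softmax over output capsules
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # weighted sum of predictions -> candidate outputs (n_out, dim)
        s = (c[..., None] * u_hat).sum(axis=0)
        v = squash(s)  # output capsule vectors
        # increase logits where predictions agree with outputs
        b += (u_hat * v[None]).sum(axis=-1)
    return v

# Example: route 8 lower capsules to 3 output capsules of dimension 4.
rng = np.random.default_rng(0)
v = dynamic_routing(rng.normal(size=(8, 3, 4)))
```

Because `squash` maps a vector of norm r to norm r²/(1+r²), every output capsule's length stays below 1 and can be read as the probability that the entity it represents is present.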
Fine‐tuning the hyperparameters and implementing these improvements resulted in remarkable accuracy rates on a large BM dataset, with reduced training time and fewer trainable parameters. CapsNet achieved 96.99%, VGG‐CapsNet achieved 98.95%, RES‐CapsNet achieved 99.24%, and the GN‐CapsNet model demonstrated superior accuracy at 99.45%. GN‐CapsNet performed best because it converges in a small number of epochs and its deep Inception architecture efficiently extracts features at different scales to form a robust representation of the input. Our proposed models were compared with existing state‐of‐the‐art models on the BM dataset; the results showed that our models outperformed the existing approaches and demonstrated excellent performance. Further, this automated system can analyze large amounts of data and complex cells in BM images, giving healthcare professionals a detailed understanding of different diseases that would be time‐consuming to achieve manually.
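The SMOTE step mentioned in the abstract can be illustrated with a minimal NumPy sketch: each synthetic sample is a random interpolation between a minority-class point and one of its k nearest minority neighbours. The data, neighbour count, and function name here are illustrative assumptions; in practice a standard implementation (e.g. imbalanced-learn's `SMOTE`) would typically be used.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=None):
    """Generate n_new synthetic minority samples by interpolating
    between each chosen sample and one of its k nearest minority
    neighbours (the core idea of SMOTE)."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    # indices of the k nearest neighbours (column 0 is the point itself)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]
    out = []
    for _ in range(n_new):
        i = rng.integers(n)                    # pick a minority sample
        j = nn[i, rng.integers(nn.shape[1])]   # pick one of its neighbours
        lam = rng.random()                     # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.asarray(out)

# Example: expand 20 minority samples with 30 synthetic ones.
rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 4))
synthetic = smote_oversample(X_min, 30, k=5, seed=1)
```

Because the synthetic points lie on line segments between real minority samples, they densify the minority region of feature space rather than duplicating existing points, which is what helps the classifier on rare classes.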