Infants are vulnerable to several health problems and cannot express their needs clearly. When they need immediate attention, they cry, which is their primary form of communication. Parents therefore need to stay alert and supervise their infants continuously, but they cannot do so all the time. An infant monitoring system that determines when an infant is crying and notifies the parents immediately is a possible solution. Although many such systems are available, most cannot detect infant cries. Some do include a cry detection mechanism, but these mechanisms are not very accurate because they rely either on obsolete approaches or on machine learning (ML) models that cannot identify infant cries in noisy household settings. To address this limitation, this research developed and analyzed in detail several conventional and hybrid ML models to find the best model for detecting infant cries in a household setting. A stacked classifier built from state-of-the-art techniques is proposed, and it outperforms all other models developed in this study. The proposed CNN-SCNet (CNN-Stacked Classifier Network) achieved a precision, recall, and F1-score of 98.72%, 98.05%, and 98.39%, respectively. Infant monitoring systems can use this classifier to detect infant cries in noisy household settings.
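For readers unfamiliar with the reported metrics, the following is a minimal sketch of the standard definitions of precision, recall, and F1-score for a binary "cry" versus "not cry" detector. The function name `precision_recall_f1` and the counts are hypothetical and chosen only for illustration; they are not taken from the study, and the sketch makes no assumption about how the study averaged its scores.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the positive ("cry") class.

    tp: cry segments correctly flagged as cries
    fp: non-cry segments wrongly flagged as cries
    fn: cry segments the detector missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not results from the study).
tp, fp, fn = 90, 10, 30
p, r, f1 = precision_recall_f1(tp, fp, fn)
print(f"precision={p:.2%} recall={r:.2%} f1={f1:.2%}")
```

In a monitoring setting, low precision means false alarms sent to parents, while low recall means missed cries, which is why both are reported alongside their harmonic mean.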