Rare class imbalance problems, which involve classifying a minority or rare class, are difficult because the rare class is smaller than the majority class. Because the majority class is easy to predict, overall accuracy tends to look high even when the minority classes are predicted poorly. For this reason, evaluating a prediction model by accuracy alone does not indicate whether the model can predict the minority classes. Therefore, a rare class prediction technique is required. In this study, a rare class prediction model is proposed for minority class prediction. A dataset from a semiconductor manufacturing process with a class imbalance problem was used to create a fault detection model. The prediction model uses data preprocessing to build the features and dataset required for the rare classes. To identify the features that distinguish the rare classes, we performed feature selection using standard deviation and Euclidean distance. In addition, a particle swarm optimization-deep belief network was applied to create a classifier. The model proposed in this research presents outstanding performance and is appropriate for highly imbalanced class problems.
KEYWORDS
class imbalance problem, deep belief network, feature selection, particle swarm optimization, rare class classification
INTRODUCTION
Because of the issues with big data and the development of deep learning techniques, methods for building prediction models are in the spotlight [1,2]. Many AI-based prediction models, which use machine learning, data mining, databases, and statistical methods, are being proposed. Such prediction models based on state-of-the-art techniques are being applied in many fields, and their industrial value is progressively increasing [3,4]. To implement prediction models accurately, it is necessary to analyze both domain knowledge and data.
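The point that accuracy alone hides poor minority class prediction can be illustrated with a minimal sketch. The wafer counts and the always-majority "model" below are hypothetical and are not taken from the study's dataset; the helper names `accuracy` and `recall` are also introduced only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical data: 95 normal wafers (label 0) and 5 faulty wafers (label 1).
y_true = [0] * 95 + [1] * 5

# A trivial model that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)                     # high, despite the useless model
rare_recall = recall(y_true, y_pred, positive=1)   # no faulty wafer is detected

print(f"accuracy = {acc:.2f}, rare-class recall = {rare_recall:.2f}")
```

Under these assumed counts the model scores high accuracy while detecting none of the rare class, which is why a per-class measure such as recall is reported alongside accuracy in imbalanced settings.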
In addition, there is an increase in demand for obtaining useful knowledge from the collected data, and therefore, active research is being conducted on prediction models that are suitable for specific domains [5,6]. Thus, the importance of classification prediction techniques for class imbalance problems including class distribution, which is one of the main issues in the field of data mining, is increasing [7-9]. When the classes are balanced (balanced class), the ratios of the classes to be predicted are evenly distributed. Thus, by learning the data, a balanced predictive model that can predict all the classes can be generated. In an imbalance problem, the ratios of the classes to be predicted differ. In this case, a classification prediction model that can predict only a specific class (rare class or majority class) is generated. For example, in the semiconductor manufacturing process, although most of the produced wafers are regular products, there is a small probability of producing irregular products. Therefore, a rare class prediction method is required to pre...
Intermittent drainage can reduce methane (CH4) emission from rice paddy soils, but nitrous oxide (N2O) emission can increase. We hypothesized that slow-release N fertilizer can mitigate N2O emissions by reducing the N lost to the environment. In this study, we assessed the influence of slow-release N fertilizer on effective greenhouse gas (GHG) reduction. We set up three treatments: urea (U), controlled release fertilizer (CRF), and hairy vetch with urea (HV). The emission rates of CH4 and N2O were monitored using the closed chamber method during the cropping and fallow seasons. Grain yield was measured to calculate the yield-scaled greenhouse gas intensity (GHGI). Compared with the U treatment, CH4 emission was reduced in the CRF treatment but increased in the HV treatment. In contrast, N2O emission was increased in the CRF treatment but reduced in the HV treatment. Grain yield was higher in the CRF and HV treatments than in the U treatment. The GHGI was the lowest in the CRF treatment because of its high grain yield and low GHG emission. In contrast, the GHGI was the highest in the HV treatment because of increased CH4 emission. In conclusion, controlled release fertilizer can effectively reduce GHG emission. However, CRF application increased N2O emissions during the fallow season, and further investigation is needed to determine whether this is due to the effect of fertilizer residues. In addition, because field experiments are easily influenced by environmental conditions, the results should be verified through additional investigations over multiple years.
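How a yield-scaled GHGI can trade off emissions against yield may be easier to see in a small sketch. It assumes the commonly used definition (CO2-equivalent emissions of CH4 and N2O divided by grain yield) and 100-year global warming potentials of 28 for CH4 and 265 for N2O; the abstract does not state which definition or factors the authors used. The function name `ghgi` and all input values are hypothetical and are not results from the study.

```python
def ghgi(ch4_kg_ha, n2o_kg_ha, grain_yield_kg_ha, gwp_ch4=28.0, gwp_n2o=265.0):
    """Yield-scaled greenhouse gas intensity in kg CO2-eq per kg of grain.

    The GWP defaults are assumed 100-year factors; replace them with the
    factors required by the reporting convention in use.
    """
    if grain_yield_kg_ha <= 0:
        raise ValueError("grain yield must be positive")
    # Convert both gases to a common CO2-equivalent basis and sum them.
    co2_eq_kg_ha = ch4_kg_ha * gwp_ch4 + n2o_kg_ha * gwp_n2o
    return co2_eq_kg_ha / grain_yield_kg_ha


# Hypothetical seasonal totals (kg/ha), chosen only to show the arithmetic.
baseline = ghgi(ch4_kg_ha=200.0, n2o_kg_ha=1.0, grain_yield_kg_ha=6000.0)
alternative = ghgi(ch4_kg_ha=150.0, n2o_kg_ha=1.5, grain_yield_kg_ha=6500.0)

print(f"baseline GHGI    = {baseline:.3f} kg CO2-eq per kg grain")
print(f"alternative GHGI = {alternative:.3f} kg CO2-eq per kg grain")
```

In this made-up comparison the alternative emits more N2O but still has the lower GHGI, because its CH4 reduction and higher yield outweigh the N2O increase.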
Object: Pathologic prediction of prostate cancer can be made by predicting the patient's prostate metastasis prior to surgery based on biopsy information. Because biopsy variables associated with pathology have uncertainty regarding individual patient differences, a method for classification according to these variables is needed.
Method: We propose a deep belief network and Dempster-Shafer- (DBN-DS-) based multiclassifier for the pathologic prediction of prostate cancer. The DBN-DS learns prostate-specific antigen (PSA), Gleason score, and clinical T stage variable information using three DBNs. Uncertainty regarding the predicted output was removed from the DBN and combined with information from DS to make a correct decision.
Result: The new method was validated on pathology data from 6342 patients with prostate cancer. The pathology stages consisted of organ-confined disease (OCD; 3892 patients) and non-organ-confined disease (NOCD; 2453 patients). The results showed that the accuracy of the proposed DBN-DS was 81.27%, which is higher than the 64.14% of the Partin table.
Conclusion: The proposed DBN-DS is more effective than other methods in predicting pathology stage. The performance is high because of the linear combination using the results of pathology-related features. The proposed method may be effective in decision support for prostate cancer treatment.
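The idea of fusing several classifier outputs with Dempster-Shafer theory can be sketched with the generic Dempster rule of combination. This is not the authors' DBN-DS implementation, which the abstract does not detail: the three mass assignments below stand in for outputs of the PSA, Gleason score, and clinical T stage classifiers and are hypothetical, as is the function name `combine_masses`.

```python
from functools import reduce
from itertools import product

# Frame of discernment: the two pathology stages named in the abstract.
OCD = frozenset({"OCD"})
NOCD = frozenset({"NOCD"})
THETA = OCD | NOCD  # the whole frame, i.e. mass left uncommitted


def combine_masses(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical evidence from three classifiers (values chosen for illustration).
m_psa = {OCD: 0.6, NOCD: 0.3, THETA: 0.1}
m_gleason = {OCD: 0.5, NOCD: 0.3, THETA: 0.2}
m_stage = {OCD: 0.4, NOCD: 0.4, THETA: 0.2}

fused = reduce(combine_masses, [m_psa, m_gleason, m_stage])
decision = max((OCD, NOCD), key=lambda stage: fused.get(stage, 0.0))

print({"/".join(sorted(k)): round(v, 4) for k, v in fused.items()})
print("decision:", next(iter(decision)))
```

Mass on `THETA` expresses how much each source leaves undecided, so a source that is unsure about a patient pulls the fused result less strongly than one that commits most of its mass to a single stage.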