This study examines the spatial structure of cleft lip and palate (CLP) cases in children and their association with polluted areas in the Monterrey Metropolitan Area (MMA). The Nearest Neighbor Index (NNI) and the Spatial Statistical Scan (SaTScan) showed that the CLP cases are agglomerated in spatial clusters distributed across different areas of the city, some of them grouping up to 12 cases within a radius of 1.2 km. Interpolation by empirical Bayesian kriging (EBK) and the inverse distance weighted (IDW) method showed that 95% of the cases spatially interact with particulate matter (PM10) values of more than 50 points. The study also shows that 83% of the cases interacted with around 2000 annual tons of greenhouse gases. This study may contribute to other investigations that apply techniques for identifying environmental and genetic factors possibly associated with congenital malformations and for determining the influence of contaminating substances on the incidence of these conditions, particularly CLP.
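As a rough illustration of the Nearest Neighbor Index mentioned above, the following minimal sketch computes the Clark-Evans form of the index for planar coordinates. It is not the authors' workflow: the function name, the brute-force distance search, and the lack of edge correction are assumptions made for illustration, and the coordinates are assumed to be projected (for example, in km) rather than latitude and longitude.

```python
import math


def nearest_neighbor_index(points, area):
    """Clark-Evans nearest neighbor index for 2D points in a study area.

    points: list of (x, y) tuples in projected units (assumption).
    area:   study-area size in the same units squared.
    Values below 1 suggest clustering, values above 1 suggest dispersion.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Observed mean distance from each point to its nearest neighbor
    # (brute force, O(n^2); fine for small samples).
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n

    # Expected mean nearest-neighbor distance under complete spatial
    # randomness: 0.5 / sqrt(density), with density = n / area.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected
```

An index well below 1 would be consistent with the clustered pattern the abstract reports, although a significance test (not shown here) is needed before drawing that conclusion.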
Background
Diffuse large B-cell lymphoma (DLBCL) is classified into germinal center-like (GCB) and non-germinal center-like (non-GCB) cell-of-origin groups, entities driven by different oncogenic pathways with different clinical outcomes. DLBCL classification by immunohistochemistry (IHC)-based decision tree algorithms is reported to be a simpler technique than gene expression profiling (GEP). However, IHC-decision tree algorithms show significant discrepancies when compared with GEP.
Methods
To address these inconsistencies, we applied a machine learning approach using the same combinations of antibodies as the IHC-decision tree algorithms. Immunohistochemistry data from a public DLBCL database were used to compare the IHC-decision tree algorithms with machine learning models based on Bayesian, Bayesian simple, Naïve Bayesian, artificial neural network, and support vector machine classifiers in order to identify the best diagnostic model. We applied linear discriminant analysis to the complete database, detecting a higher influence of the BCL6 antibody for GCB classification and of MUM1 for non-GCB classification.
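To make the idea of classifying from antibody staining without a fixed examination order more concrete, the sketch below implements a small Bernoulli naive Bayes classifier over binary marker results. It is only an illustration of one of the model families named above, not the authors' PV algorithm: the function names, the two-marker feature layout (BCL6, MUM1), and the toy training rows are all hypothetical and synthetic, chosen only to mirror the direction of influence described in the text.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(samples, labels):
    """Fit a Bernoulli naive Bayes model with add-one (Laplace) smoothing.

    samples: list of tuples of 0/1 marker results (1 = positive staining).
    labels:  list of class labels, one per sample.
    Returns {label: (prior, [P(marker_k = 1 | label), ...])}.
    """
    n_features = len(samples[0])
    class_counts = defaultdict(int)
    positive_counts = defaultdict(lambda: [0] * n_features)
    for x, y in zip(samples, labels):
        class_counts[y] += 1
        for k, value in enumerate(x):
            positive_counts[y][k] += value

    total = len(labels)
    model = {}
    for y, count in class_counts.items():
        prior = count / total
        # Smoothing keeps every probability strictly between 0 and 1.
        probs = [(positive_counts[y][k] + 1) / (count + 2)
                 for k in range(n_features)]
        model[y] = (prior, probs)
    return model


def predict(model, x):
    """Return the label with the highest log-posterior for marker tuple x."""
    best_label, best_score = None, -math.inf
    for y, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(x, probs):
            score += math.log(p if value else 1.0 - p)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Synthetic toy rows (NOT study data): columns are (BCL6, MUM1).
TOY_SAMPLES = [(1, 0), (1, 0), (1, 1), (1, 0),
               (0, 1), (0, 1), (1, 1), (0, 1)]
TOY_LABELS = ["GCB"] * 4 + ["non-GCB"] * 4
```

Because the model combines all markers at once, no marker has to be read before another, which is the property the decision-tree algorithms lack. A real evaluation would use the published database and held-out data rather than toy rows.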
Results
The classifier with the highest metrics was the four antibody-based Perfecto–Villela (PV) algorithm with 0.94 accuracy, 0.93 specificity, and 0.95 sensitivity, with a perfect agreement with GEP (κ = 0.88, P < 0.001). After training, data from a sample of 49 Mexican-mestizo DLBCL patients were classified by cell of origin (COO) for the first time in a testing trial.
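For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, sensitivity, specificity, and Cohen's κ can be derived from paired reference (for example, GEP) and predicted labels. The function name and the choice of "GCB" as the positive class are assumptions for illustration; the abstract does not state which class was treated as positive, and the sketch assumes both classes occur in the reference labels.

```python
def binary_agreement_metrics(reference, predicted, positive="GCB"):
    """Accuracy, sensitivity, specificity, and Cohen's kappa for two raters.

    reference: list of reference labels (e.g. from GEP).
    predicted: list of classifier labels, same length and order.
    positive:  label treated as the positive class (assumption).
    """
    pairs = list(zip(reference, predicted))
    n = len(pairs)
    tp = sum(r == positive and p == positive for r, p in pairs)
    tn = sum(r != positive and p != positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Agreement expected by chance, from the marginal totals of each rater.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_chance) / (1.0 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }
```

κ corrects raw agreement for the agreement expected by chance, which is why it can be lower than accuracy on the same data.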
Conclusions
Harnessing all the available immunohistochemical data without reliance on the order of examination or cut-off value, we conclude that our PV machine learning algorithm outperforms Hans and other IHC-decision tree algorithms currently in use and represents an affordable and time-saving alternative for DLBCL cell-of-origin identification.
Electronic supplementary material
The online version of this article (10.1186/s12967-019-1951-y) contains supplementary material, which is available to authorized users.
This research examines the spatial structure of a sample of breast cancer (BC) cases and their spatial interaction with contaminated areas in the Monterrey Metropolitan Area (MMA). By applying spatial statistical techniques that treat space as a continuum, degrees of spatial concentration were determined for the different study groups, highlighting their concentration pattern. The results indicate that 65 percent of the BC sample had exposure to more than 56 points of PM10. Likewise, spatial clusters of up to 39 BC cases were identified within a radius of 3.5 km, interacting spatially with environmental contamination sources, particularly refineries, food processing plants, and the cement and metal industries. This study can serve as a platform for other clinical research by identifying geographic clusters that can help focus health policy efforts.