Abstract. Over the past decade, data science and machine learning have grown from a mysterious art form to a staple tool across a variety of fields in academia, business, and government. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning: pipeline design. We implement a Tree-based Pipeline Optimization Tool (TPOT) and demonstrate its effectiveness on a series of simulated and real-world genetic data sets. In particular, we show that TPOT can build machine learning pipelines that achieve competitive classification accuracy and discover novel pipeline operators, such as synthetic feature constructors, that significantly improve classification accuracy on these data sets. We also highlight the current challenges to pipeline optimization, such as the tendency to produce pipelines that overfit the data, and suggest future research paths to overcome these challenges. As such, this work represents an early step toward fully automating machine learning pipeline design.
Risk factors for gastric cancer are receiving renewed attention in light of the recent positive association of Helicobacter pylori infection with gastric cancer. The effect of H. pylori on the balance between oxidants and antioxidants in the stomach is not well known. In this study, we investigated whether exposure of gastric cells to H. pylori increases oxidant-associated gastric epithelial cell injury. A human gastric epithelial cell line (AGS) was grown on 96-well clusters, then exposed overnight to either live H. pylori (four cagA(+) and four cagA(-)) or broth culture supernatant from an isogenic H. pylori cagA(+) strain with and without vacA activity. Incubation of AGS cells with cagA(+) and cagA(-) H. pylori strains before exposure to reactive oxygen species (ROS) reduced cell viability on average to 73.7% and 39.5% of controls, respectively. The percent viability of cells exposed to ROS after incubation with control broth, vacA(-) broth and vacA(+) broth was 97.7%, 70.5% and 63.5%, respectively. Experiments were then performed to evaluate the effects of H. pylori exposure on the activities of ROS-scavenging enzymes [catalase, glutathione peroxidase and superoxide dismutase (SOD)] and formation of 8-hydroxy-2-deoxyguanosine (8-OH-dG) adducts in AGS cells. Overnight exposure to cagA(-) strains reduced catalase activity by 42%; in contrast, exposure to cagA(+) H. pylori strains increased catalase activity by 51%. Glutathione peroxidase activity increased with exposure to both cagA(-) and cagA(+) strains by 95% and 240%, respectively. Total SOD activity increased 156% after exposure to cagA(+) strains and was marginally increased (52%) with exposure to cagA(-) strains. CuZn-SOD protein levels, assayed by enzyme-linked immunosorbent assay, were not significantly altered by exposure to H. pylori strains; however, Mn-SOD concentrations were significantly increased (P < 0.02) after exposure to both cagA(-) and cagA(+) H. pylori strains. Exposure of AGS cells to cagA(+) and cagA(-) H. pylori was associated with, on average, 44.5 and 99.0 8-OH-dG/10^6 dG, respectively. The increase in catalase, glutathione peroxidase and SOD activity is associated with fewer 8-OH-dG DNA adducts and reduced susceptibility of AGS cells to lethal injury from ROS after exposure to cagA(+) H. pylori strains when compared with exposure to cagA(-) H. pylori strains. Alteration in the activity of ROS-scavenging enzymes by the presence of H. pylori may in part be responsible for the increased risk of gastric cancer in persons infected with H. pylori.
Animal and in vitro models of prostate cancer demonstrate that high IL-10 levels result in smaller tumors, fewer metastases, and reduced angiogenesis compared to controls. We sought to examine the hypothesis that genotypes correlated with low IL-10 production may be associated with increased prostate cancer risk among Finnish male participants from the Alpha-tocopherol Beta-carotene Cancer Prevention Study. DNA from 584 prostate cancer cases and 584 controls was genotyped for four IL-10 alleles: -1082, -819, -592, and 210. DNA from more of the controls than cases failed to amplify, resulting in 509 cases and 382 controls with genotype data for -1082; 507 and 384 for -819; 511 and 386 for -592; and 491 and 362 for 210. Odds ratios for the association between the IL-10 genotypes and risk of prostate cancer or, among cases only, high-grade disease were calculated using logistic regression. In this population, the -819 TT and -592 AA low expression genotypes were highly correlated. These two genotypes were also associated with increased prostate cancer susceptibility (OR = 1.92, 95% CI 1.07-3.43 for -819) and, among cases, with high-grade tumors (OR = 2.56, 95% CI 1.26-5.20 for -819). These data demonstrate that genotypes correlated with low IL-10 production are associated with increased risk of prostate cancer and with high-grade disease in this population.
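For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following is a minimal sketch, not the study's analysis. It assumes a single binary genotype indicator with no covariates, in which case the logistic regression odds ratio equals the cross-product ratio of the 2x2 table and a Wald interval can be formed on the log scale. The function name and the counts in the usage lines are hypothetical and are not taken from the abstract.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (odds_ratio, lower, upper) for a 2x2 case-control table.

    Uses the Wald interval on log(OR); z=1.96 gives roughly 95% coverage.
    All four cell counts must be positive.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")

    # Cross-product ratio: odds of exposure in cases / odds in controls.
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)

    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))

    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_wald_ci(40, 60, 20, 80)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as in the intervals reported above, is what supports calling the association statistically significant at the 5% level.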
BACKGROUND: Prostate cancer (PCa) incidence and mortality are disproportionately high among African-American (AA) men. Its detection and perhaps its disparities could be improved through the identification of genetic susceptibility biomarkers within essential biological pathways. Interactions among highly variant genes, central to angiogenesis, may modulate susceptibility for prostate cancer, as previously demonstrated. This study evaluates the interplay among three highly variant genes (i.e., IL-10, TGFβR-1, VEGF), their receptors and their influence on PCa within a case-control study consisting of an under-served population. METHODS: This study evaluated single gene and joint modifying effects on PCa risk in a case-control study comprised of 859 AA men (193 cases and 666 controls) using TaqMan qPCR. Interaction among polymorphic IL-10, TGFβR-1 and VEGF was analyzed using conventional logistic regression analysis (LR) models, multifactor dimensionality reduction (MDR) and interaction entropy graphs. Symbolic modeling allowed validation of gene–gene interaction findings identified by MDR. RESULTS: No significant single gene effects were demonstrated in relation to PCa risk. However, carriers of the VEGF 2482T allele had a threefold increase in the risk of developing aggressive PCa. The presence of VEGF 2482T combined with VEGFR IVS6 + 54 loci was highly significant for the risk of PCa based on MDR and symbolic modeling analyses. These findings were substantiated by 1,000-fold cross validation permutation testing (P = 0.04). CONCLUSION: These findings suggest the inheritance of VEGF and VEGFR IVS6 + 54 sequence variants may jointly modify PCa susceptibility through their influence on angiogenesis. Larger sub-population studies are needed to validate these findings and evaluate whether the VEGF-VEGFR axis may serve as a predictor of disease prognosis and ultimately clinical response to available treatment strategies.
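The core idea behind MDR can be sketched as follows. This is a simplified illustration, not the authors' pipeline. It assumes genotypes are coded as small integers and pools each two-locus genotype combination into a "high-risk" or "low-risk" group by comparing its case:control ratio with the overall ratio. It then scores each locus pair by training accuracy only, whereas a full MDR analysis would add cross-validation and permutation testing. The function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mdr_best_pair(genotypes, labels):
    """Return (accuracy, (i, j), high_risk_cells) for the best locus pair.

    genotypes: list of equal-length tuples of genotype codes, one per subject.
    labels:    list of 1 (case) / 0 (control), same length as genotypes.
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    threshold = n_cases / n_controls  # overall case:control ratio

    best = None
    for i, j in combinations(range(len(genotypes[0])), 2):
        cases, controls = Counter(), Counter()
        for g, y in zip(genotypes, labels):
            (cases if y == 1 else controls)[(g[i], g[j])] += 1

        # A cell is high-risk when its case:control ratio exceeds the
        # overall ratio (written without division to allow empty cells).
        high_risk = {cell for cell in set(cases) | set(controls)
                     if cases[cell] > threshold * controls[cell]}

        # Classify each subject by the risk group of its genotype cell.
        correct = sum(((g[i], g[j]) in high_risk) == (y == 1)
                      for g, y in zip(genotypes, labels))
        accuracy = correct / len(labels)
        if best is None or accuracy > best[0]:
            best = (accuracy, (i, j), high_risk)
    return best


# Toy data: loci 0 and 1 interact (XOR pattern); locus 2 is noise.
toy = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
toy_labels = [a ^ b for a, b, _ in toy]
accuracy, pair, cells = mdr_best_pair(toy, toy_labels)
print(pair, accuracy, sorted(cells))
```

In the toy data neither locus 0 nor locus 1 predicts case status alone, yet the pair classifies every subject correctly, which is the kind of purely joint effect that single-gene tests miss.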
Background: Polymorphisms in glutathione S-transferase (GST) genes may influence response to oxidative stress and modify prostate cancer (PCA) susceptibility. These enzymes generally detoxify endogenous and exogenous agents, but also participate in the activation and inactivation of oxidative metabolites that may contribute to PCA development. Genetic variations within selected GST genes may influence PCA risk following exposure to carcinogenic compounds found in cigarette smoke and decrease the ability to detoxify them. Thus, we evaluated the effects of polymorphic GSTs (M1, T1, and P1) alone and combined with cigarette smoking on PCA susceptibility. Methods: In order to evaluate the effects of GST polymorphisms in relation to PCA risk, we used TaqMan allelic discrimination assays along with a multi-faceted statistical strategy involving conventional and advanced statistical methodologies (e.g., Multifactor Dimensionality Reduction and Interaction Graphs). Genetic profiles collected from 873 men of African descent (208 cases and 665 controls) were utilized to systematically evaluate the single and joint modifying effects of GSTM1 and GSTT1 gene deletions, GSTP1 105 Val and cigarette smoking on PCA risk. Results: We observed a moderately significant association between PCA risk and possession of at least one variant GSTP1 105 Val allele (OR = 1.56; 95%CI = 0.95-2.58; p = 0.049), which was confirmed by MDR permutation testing (p = 0.001). We did not observe any significant single gene effects among GSTM1 (OR = 1.08; 95%CI = 0.65-1.82; p = 0.718) and GSTT1 (OR = 1.15; 95%CI = 0.66-2.02; p = 0.622) on PCA risk among all subjects. Although the GSTM1-GSTP1 pairwise combination was selected as the best two-factor LR and MDR model (p = 0.01), assessment of the hierarchical entropy graph suggested that the observed synergistic effect was primarily driven by the GSTP1 Val marker. Notably, the GSTM1-GSTP1 axis did not provide additional information gain when compared to either locus alone based on a hierarchical entropy algorithm and graph. Smoking status did not significantly modify the relationship between the GST SNPs and PCA. Conclusion: A moderately significant association was observed between PCA risk and possession of at least one variant GSTP1 105 Val allele (p = 0.049) among men of African descent. We also observed a 2.1-fold increase in PCA risk associated with men possessing the GSTP1 (Val/Val) and GSTM1 (*1/*1 + *1/*0) alleles. MDR analysis validated these findings, detecting GSTP1 105 Val (p = 0.001) as the best single factor for predicting PCA risk. Our findings emphasize the importance of utilizing a combination of traditional and advanced statistical tools to identify and validate single gene and multi-locus interactions in relation to cancer susceptibility.