Abstract-Factor analysis provides linear factors that describe relationships between individual variables of a data set. We extend this classical formulation into linear factors that describe relationships between groups of variables, where each group represents either a set of related variables or a data set. The model also naturally extends canonical correlation analysis to more than two sets, in a way that is more flexible than previous extensions. Our solution is formulated as variational inference of a latent variable model with structural sparsity, and it consists of two hierarchical levels: The higher level models the relationships between the groups, whereas the lower models the observed variables given the higher level. We show that the resulting solution solves the group factor analysis problem accurately, outperforming alternative factor analysis based solutions as well as more straightforward implementations of group factor analysis. The method is demonstrated on two life science data sets, one on brain activation and the other on systems biology, illustrating its applicability to the analysis of different types of highdimensional data sources.
Background. Unlike in most adult-onset cancers, an association between typical paediatric neoplasms and inflammatory triggers is rare. We studied whether immune system-related genes are activated and have prognostic significance in Ewing's sarcoma family of tumors (ESFTs). Method. Data analysis was performed on gene expression profiles of 44 ESFT patients, 11 ESFT cell lines, and 18 normal skeletal muscle samples. Differential expression of 238 inflammation and 299 macrophage-related genes was analysed by t-test, and survival analysis was performed according to gene expression. Results. Inflammatory genes are activated in ESFT patient samples, as 38 of 238 (16%) inflammatory genes were upregulated (P < 0.001) when compared to cell lines. This inflammatory gene activation was characterized by significant enrichment of macrophage-related gene expression with 58 of 299 (19%) of genes upregulated (P < 0.001). High expression of complement component 5 (C5) correlated with better event-free (P = 0.01) and overall survival (P = 0.004) in a dose-dependent manner. C5 and its receptor C5aR1 expression was verified at protein level by immunohistochemistry on an independent ESFT tumour tissue microarray. Conclusion. Immune system-related gene activation is observed in ESFT patient samples, and prognostically significant inflammatory genes (C5, JAK1, and IL8) for ESFT were identified.
Studies on retention and success in introductory programming courses have suggested that previous programming experience contributes to students' course outcomes. If such background information could be automatically distilled from students' working process, additional guidance and support mechanisms could be provided even to those who do not wish to disclose such information. In this study, we explore methods for automatically distinguishing novice programmers from more experienced programmers using finegrained source code snapshot data. We approach the issue by partially replicating a previous study that used students' keystroke latencies as a proxy to introductory programming course outcomes, and follow this with an exploration of machine learning methods to separate students with little to no previous programming experience from those with more experience. Our results confirm that students' keystroke latencies can be used as a metric for measuring course outcomes. At the same time, our results show that students' programming experience can be identified to some extent from keystroke latency data, which means that such data has potential as a source of information for customizing the students' learning experience.
BackgroundEwing sarcoma family of tumors (ESFT), characterized by t(11;22)(q24;q12), is one of the most common tumors of bone in children and young adults. In addition to EWS/FLI1 gene fusion, copy number changes are known to be significant for the underlying neoplastic development of ESFT and for patient outcome. Our genome-wide high-resolution analysis aspired to pinpoint genomic regions of highest interest and possible target genes in these areas.MethodsArray comparative genomic hybridization (CGH) and expression arrays were used to screen for copy number alterations and expression changes in ESFT patient samples. A total of 31 ESFT samples were analyzed by aCGH and in 16 patients DNA and RNA level data, created by expression arrays, was integrated. Time of the follow-up of these patients was 5–192 months. Clinical outcome was statistically evaluated by Kaplan-Meier/Logrank methods and RT-PCR was applied on 42 patient samples to study the gene of the highest interest.ResultsCopy number changes were detected in 87% of the cases. The most recurrent copy number changes were gains at 1q, 2, 8, and 12, and losses at 9p and 16q. Cumulative event free survival (ESFT) and overall survival (OS) were significantly better (P < 0.05) for primary tumors with three or less copy number changes than for tumors with higher number of copy number aberrations. In three samples copy number imbalances were detected in chromosomes 11 and 22 affecting the FLI1 and EWSR1 loci, suggesting that an unbalanced t(11;22) and subsequent duplication of the derivative chromosome harboring fusion gene is a common event in ESFT. Further, amplifications on chromosomes 20 and 22 seen in one patient sample suggest a novel translocation type between EWSR1 and an unidentified fusion partner at 20q. In total 20 novel ESFT associated putative oncogenes and tumor suppressor genes were found in the integration analysis of array CGH and expression data. Quantitative RT-PCR to study the expression levels of the most interesting gene, HDGF, confirmed that its expression was higher than in control samples. However, no association between HDGF expression and patient survival was observed.ConclusionWe conclude that array CGH and integration analysis proved to be effective methods to identify chromosome regions and novel target genes involved in the tumorigenesis of ESFT.
Being able to identify the user of a computer solely based on their typing patterns can lead to improvements in plagiarism detection, provide new opportunities for authentication, and enable novel guidance methods in tutoring systems. However, at the same time, if such identification is possible, new privacy and ethical concerns arise. In our work, we explore methods for identifying individuals from typing data captured by a programming environment as these individuals are learning to program. We compare the identification accuracy of automatically generated user profiles, ranging from the average amount of time that a user needs between keystrokes to the amount of time that it takes for the user to press specific pairs of keys, digraphs. We also explore the effect of data quantity and different acceptance thresholds on the identification accuracy, and analyze how the accuracy changes when identifying individuals across courses. Our results show that, while the identification accuracy varies depending on data quantity and the method, identification of users based on their programming data is possible. These results indicate that there is potential in using this method, for example, in identification of students taking exams, and that such data has privacy concerns that should be addressed.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.