Novel non-parametric dimensionality reduction techniques such as t-distributed stochastic neighbor embedding (t-SNE) lead to a powerful and flexible visualization of high-dimensional data. One drawback of non-parametric techniques is their lack of an explicit out-of-sample extension. In this contribution, we propose an efficient extension of t-SNE to a parametric framework, kernel t-SNE, which preserves the flexibility of basic t-SNE but enables explicit out-of-sample extensions. We test kernel t-SNE against standard t-SNE on benchmark data sets, in particular addressing how well the mapping generalizes to novel data. For large data sets, this procedure enables us to train a mapping on a fixed-size subset only and to map all data afterwards in linear time. We demonstrate that this technique also yields satisfactory results for large data sets, provided that the information missing due to the small size of the subset is compensated for by auxiliary information such as class labels, which can be integrated into kernel t-SNE based on the Fisher information.
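The abstract above does not spell out the form of the parametric mapping, so the following is only a minimal sketch of one way an explicit out-of-sample extension could look. It assumes a normalized Gaussian kernel mapping whose coefficients are fitted by least squares to an embedding that was already computed (for example by standard t-SNE) on a training subset. The function names, the bandwidth parameter `sigma`, and the least-squares fit are illustrative assumptions, not details confirmed by the text.

```python
import numpy as np


def normalized_gaussian_kernel(X, prototypes, sigma):
    """Row-normalized Gaussian kernel between the rows of X and the prototypes.

    Assumption: one shared bandwidth `sigma`. If sigma is very small relative
    to the distances, a row can underflow to zero and produce NaNs.
    """
    diff = X[:, None, :] - prototypes[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    K = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)


def fit_kernel_map(X_train, Y_train, sigma):
    """Fit coefficients A so that K(X_train) @ A approximates Y_train.

    Y_train is a precomputed low-dimensional embedding of X_train.
    The pseudo-inverse gives the least-squares solution.
    """
    K = normalized_gaussian_kernel(X_train, X_train, sigma)
    return np.linalg.pinv(K) @ Y_train


def apply_kernel_map(X_new, X_train, A, sigma):
    """Map new points into the embedding space using the fitted coefficients."""
    return normalized_gaussian_kernel(X_new, X_train, sigma) @ A
```

Under these assumptions, mapping one new point costs time proportional to the size of the training subset, so mapping a whole data set with a fixed-size subset scales linearly in the number of points, in line with the linear-time claim above.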
We present results from a detailed analysis of human Ribosomal Protein (RP) levels in normal and cancer samples and cell lines, drawing on large mRNA, copy number variation and ribosome profiling datasets. After normalizing total RP mRNA levels per sample, we find highly consistent tissue-specific RP mRNA signatures in normal and tumor samples. Multiple RP mRNA subtypes exist in several cancers, with significant survival and genomic differences. Some RP mRNA variations among subtypes correlate with copy number loss of RP genes. In kidney cancer, RP subtypes map to molecular subtypes related to cell-of-origin. Pan-cancer analysis of TCGA data showed widespread single/double copy loss of RP genes, without significantly affecting survival. In several cancer cell lines, CRISPR-Cas9 knockout of RP genes did not affect cell viability. Matched RP ribosome profiling and mRNA data in humans and rodents stratified by tissue and developmental stage and were strongly correlated, showing that RP translation rates were proportional to mRNA levels. In a small dataset of human adult and fetal tissues, RP protein levels showed heterogeneity specific to developmental stage and tissue. Our results suggest that heterogeneous RP levels play a significant functional role in cellular physiology, in both normal and disease states.
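The abstract above mentions normalizing total RP mRNA levels per sample without giving the exact procedure. As a hedged illustration only, the sketch below assumes that each sample's RP expression values are rescaled so that they sum to one, which makes relative RP composition comparable across samples. The function name and the samples-by-genes layout are hypothetical.

```python
import numpy as np


def normalize_per_sample(expr):
    """Rescale each row (sample) so its RP expression values sum to 1.

    Assumption: `expr` is a samples x RP-genes array of non-negative
    expression values; this is one plausible reading of per-sample
    normalization, not necessarily the authors' procedure.
    """
    expr = np.asarray(expr, dtype=float)
    totals = expr.sum(axis=1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("every sample needs a positive total RP expression")
    return expr / totals
```

With this scaling, two samples that differ only in overall RP abundance end up with identical normalized profiles.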
Although automated classifiers are a standard tool in many application areas, there is hardly any generic way to inspect their behavior directly, beyond the mere classification of (sets of) data points. In this contribution, we propose a general framework for visualizing a given classifier and its behavior on a given data set in two dimensions. More specifically, we use modern nonlinear dimensionality reduction (DR) techniques to project a given set of data points and their relation to the classification decision boundaries. Furthermore, since data are usually intrinsically more than two-dimensional and hence cannot be projected to two dimensions without information loss, we propose to use discriminative DR methods, which shape the projection according to a given class labeling, as is available in a classification setting. Given a data set, this framework can be used to visualize any trained classifier that provides a probability or certainty of the classification together with the predicted class label. We demonstrate the suitability of the framework for different dimensionality reduction techniques, for different attention foci of the visualization, and for different classifiers to be visualized.
This paper analyzes how banks react to the financial crisis and to deteriorating solvency and liquidity conditions in their investment decisions and the composition of their financial assets. We use a novel dataset, which comprises all security investments by all German banks on a security-by-security basis between 2006 and 2011, and analyze whether and how banks use sales and purchases of these securities as the most direct and immediate way to change their overall asset structure. We find that banks substantially change their investment strategies with the beginning of the financial crisis. In particular, they shift their investments towards securities that are eligible as collateral in central bank credit operations and towards domestic securities. These patterns hold in particular for less healthy, low-rated and large banks. Furthermore, banks with substantial exposure to troubled assets, such as Greek government bonds, are particularly active in this respect. Our results highlight the substantial changes following the financial crisis in banks' securities portfolios, which constitute a major part of their assets, and have important implications for the current regulatory and policy debate on banks' investment decisions.
Research on machine learning approaches for upper limb prosthesis control has shown impressive progress. However, translating these results from the lab to patients' everyday lives remains a challenge, because advanced control schemes tend to break down under everyday disturbances, such as electrode shifts. Recently, it has been suggested to apply adaptive transfer learning to counteract electrode shifts using as little newly recorded training data as possible. In this paper, we present a novel, simple version of transfer learning and provide the first user study demonstrating the effectiveness of transfer learning in counteracting electrode shifts. For this purpose, we introduce the novel Box and Beans test to evaluate prosthesis proficiency and compare user performance with an initial simple pattern recognition system, the system under electrode shifts, and the system after transfer learning. Our results show that transfer learning could significantly alleviate the impact of electrode shifts on user performance in the Box and Beans test.