We propose an unsupervised method for obtaining cross-lingual embeddings without any parallel data or pre-trained word embeddings. The proposed model, which we call multilingual neural language models, takes sentences in multiple languages as input. It contains bidirectional LSTMs that act as forward and backward language models, and these networks are shared among all the languages. The other parameters, i.e. the word embeddings and the linear transformation between hidden states and outputs, are specific to each language. The shared LSTMs can capture the sentence structure common to all languages. Accordingly, the word embeddings of each language are mapped into a common latent space, making it possible to measure the similarity of words across multiple languages. We evaluate the quality of the cross-lingual word embeddings on a word alignment task. Our experiments demonstrate that our model can obtain cross-lingual embeddings of much higher quality than existing unsupervised models when only a small amount of monolingual data (i.e. 50k sentences) is available, or when the domains of the monolingual data differ across languages.
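To illustrate what measuring similarity in a common latent space makes possible, the following is a minimal sketch of word alignment by nearest cosine neighbour. It is not the paper's implementation: the function names `cosine` and `align` and the toy three-dimensional vectors are hypothetical, and the vectors are assumed to already live in a shared space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def align(src_emb, tgt_emb):
    """Map each source word to its most similar target word."""
    return {
        src_word: max(tgt_emb, key=lambda t: cosine(src_vec, tgt_emb[t]))
        for src_word, src_vec in src_emb.items()
    }


# Hypothetical toy embeddings, assumed to share one latent space.
en = {"dog": [0.9, 0.1, 0.0], "house": [0.1, 0.8, 0.3]}
fr = {"chien": [0.8, 0.2, 0.1], "maison": [0.0, 0.9, 0.2]}

alignment = align(en, fr)
print(alignment)
```

With real embeddings, the same lookup would run over full vocabularies, typically with a vectorised similarity computation rather than Python loops.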
We propose a simple method for nominal coordination boundary identification. The main strength of our method is that it can identify coordination boundaries without training on labeled data, so it can be applied even when coordination structure annotations are not available. Our system employs pre-trained word embeddings to measure the similarities of words and detects the span of coordination, assuming that conjuncts share syntactic and semantic similarities. We demonstrate that our method yields good results in identifying coordinated noun phrases in the GENIA corpus and is comparable to a recent supervised method when the coordinator conjoins simple noun phrases.
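The idea that conjuncts resemble each other can be sketched as a search over candidate spans on either side of the coordinator. The abstract does not specify the scoring function, so the version below makes an assumption: each span is represented by the mean of its word vectors and pairs are scored by cosine similarity. The function names, the `max_len` parameter, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_vec(words, emb):
    """Average the embeddings of the words in a span."""
    vecs = [emb[w] for w in words]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def find_conjuncts(tokens, coord_idx, emb, max_len=3):
    """Return the (left, right) span pair around the coordinator with the
    highest similarity, trying spans of 1..max_len tokens on each side."""
    best_score, best_pair = -1.0, None
    for left_len in range(1, max_len + 1):
        for right_len in range(1, max_len + 1):
            start = coord_idx - left_len
            end = coord_idx + 1 + right_len
            if start < 0 or end > len(tokens):
                continue  # candidate span would run off the sentence
            left = tokens[start:coord_idx]
            right = tokens[coord_idx + 1:end]
            score = cosine(mean_vec(left, emb), mean_vec(right, emb))
            if score > best_score:
                best_score, best_pair = score, (left, right)
    return best_pair


# Hypothetical toy embeddings; real use would load pre-trained vectors.
emb = {
    "expression": [0.0, 0.9, 0.4], "of": [0.0, 1.0, 0.0],
    "T": [1.0, 0.2, 0.0], "B": [1.0, 0.3, 0.0],
    "cells": [0.8, 0.0, 0.6], "increased": [0.0, 0.5, 0.9],
}
tokens = ["expression", "of", "T", "cells", "and", "B", "cells", "increased"]
left, right = find_conjuncts(tokens, tokens.index("and"), emb)
print(left, right)
```

Averaging word vectors ignores word order and syntax, so this covers only the semantic half of the similarity assumption stated above.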
Background: Annual decline in kidney function is a widely applied surrogate outcome of renal failure. It is important to understand the relationships between known risk factors and the annual decline in estimated glomerular filtration rate (eGFR) according to baseline age; however, these remain unclear. Methods: A community-based retrospective cohort study of adults who underwent annual medical examinations between 1999 and 2013 was conducted. The participants were stratified into different age groups (40–49, 50–59, 60–69, 70–79, and ≥80 years) to assess the risk for loss of kidney function. A mixed-effects model was used to estimate the association between risk factors and annual changes in eGFR. Results: A total of 51,938 participants were included in the analysis. The age group of ≥80 years included 8,127 individuals. The mean annual change in eGFR was -0.39 (95% confidence interval -0.41 to -0.37) mL/min/1.73 m² per year. Older age was related to faster loss of kidney function. In the older age group, higher systolic blood pressure, proteinuria, and current smoking were related to faster loss of kidney function (p trend <0.01, 0.03, and <0.01, respectively). Conversely, each age group showed similar annual loss of kidney function related to lower hemoglobin levels and diabetes mellitus (p trend 0.47 and 0.17, respectively). Conclusions: Higher systolic blood pressure, proteinuria, and smoking were related to faster loss of kidney function, and greater effect size was observed in the older participants. More risk assessments for older people are required for personalized care.
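To make the quantity "annual change in eGFR" concrete, the sketch below computes an ordinary least-squares slope for a single participant. This is a deliberately simpler stand-in and not the mixed-effects model used in the study, which estimates slopes jointly across participants together with covariate effects. The function name `annual_slope` and the measurements are hypothetical.

```python
def annual_slope(years, egfr):
    """Least-squares slope of eGFR against time, in eGFR units per year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(egfr) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, egfr))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Hypothetical measurements for one participant over four annual visits.
years = [0, 1, 2, 3]
egfr = [80.0, 79.0, 78.5, 77.0]  # mL/min/1.73 m²

slope = annual_slope(years, egfr)
print(round(slope, 2))
```

A negative slope indicates declining kidney function; comparing such slopes across age strata is the kind of question the mixed-effects analysis addresses with proper handling of repeated measures.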
Typically, classification is conducted on a dataset that consists of numerical features and target classes. For instance, a grayscale image, which is usually represented as a matrix of integers ranging from 0 to 255, allows various classification algorithms to be applied to image classification tasks. However, many standard machine learning algorithms cannot be used optimally on datasets represented as binary features, even though such datasets are not uncommon. Moreover, oversampling algorithms such as the synthetic minority oversampling technique (SMOTE) and its variants are often used when the dataset for classification is imbalanced. Because SMOTE and its variants synthesize new minority samples based on the original samples, the diversity of the samples synthesized from binary features is highly limited due to the poor representation of the original features. To solve this problem, a preprocessing approach is studied. By converting binary features into numerical ones using feature extraction methods, subsequent oversampling methods can reach their full potential in improving the classifiers' performance. Through comprehensive experiments using benchmark datasets and real medical datasets, it was observed that a converted dataset consisting of numerical features is better suited to oversampling methods (maximum improvements in accuracy and F1-score were 35.11% and 42.17%, respectively). In addition, it is confirmed that feature extraction and oversampling contribute synergistically to the improvement of classification performance.
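The limited diversity on binary features can be seen from SMOTE's core interpolation step, in which a synthetic sample is placed on the line segment between a minority sample and one of its minority neighbours. The sketch below shows only that step, without the nearest-neighbour search, and is not the preprocessing approach studied here; the function name `smote_sample` and the example vectors are hypothetical.

```python
import random


def smote_sample(x, neighbor, rng):
    """Synthesize one point on the segment between x and its neighbor."""
    lam = rng.random()  # interpolation weight in [0, 1)
    return [a + lam * (b - a) for a, b in zip(x, neighbor)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Two hypothetical minority samples with binary features.
binary_a = [1, 0, 1, 1]
binary_b = [1, 0, 0, 1]

synthetic = smote_sample(binary_a, binary_b, rng)

# Only coordinates where the two samples disagree can change at all.
varied = [i for i, (a, b) in enumerate(zip(binary_a, binary_b)) if a != b]
print(synthetic, varied)
```

Here three of the four coordinates are identical in both samples, so every synthetic point repeats them exactly and varies only in the remaining one, which also takes a non-binary value.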
Background: Annual decline in kidney function is a widely applied surrogate outcome of renal failure. It is important to understand the relationships between known risk factors and the annual decline in estimated glomerular filtration rate (eGFR) according to baseline age; however, these remain unclear. Methods: A community-based retrospective cohort study of adults who underwent annual medical examinations between 1999 and 2013 was conducted. The participants were stratified into different age groups (40–49, 50–59, 60–69, 70–79, and ≥80 years) to assess the risk for loss of kidney function. A mixed-effects model was used to estimate the association between risk factors and annual changes in eGFR. Results: In total, 51,938 participants were included in the analysis. The age group of ≥80 years included 8,127 individuals. The mean annual change in eGFR was -0.39 (95% confidence interval: -0.41 to -0.37) mL/min/1.73 m² per year. Older age was related to faster loss of kidney function. In the older age group, higher systolic blood pressure, proteinuria, and current smoking were related to faster loss of kidney function (p trend <0.01, 0.03, and <0.01, respectively). Conversely, each age group showed similar annual loss of kidney function related to lower hemoglobin levels and diabetes mellitus (p trend 0.47 and 0.17, respectively). Conclusions: Higher systolic blood pressure, proteinuria, and smoking were related to faster loss of kidney function, and a greater effect size was observed in the older participants. More risk assessments for older people are required for personalized care.