The Loop Current (LC) is the dominant circulation system in the Gulf of Mexico. Long‐term prediction of the LC system (LCS) behavior is critical for understanding the Gulf of Mexico oceanography and ecosystem, and for mitigating the outcomes of anthropogenic and natural disasters. In early 2018, the National Academies of Sciences, Engineering, and Medicine challenged the research community to develop systems that can forecast the movement of the LCS over longer periods than the current state of the art. In this paper, a Recurrent Neural Network, the Long Short‐Term Memory (LSTM) network, is applied to predict the LC evolution and the LC ring formation. The LSTM model is trained to learn patterns hidden in sea surface height (SSH) time series. To reduce the memory demand imposed by the high‐spatial‐resolution SSH data set, the region of interest is partitioned into nonoverlapping subregions. After partitioning, an LSTM network is trained to predict the SSH in each subregion. A smoothing function is then applied to reduce discontinuities of the SSH predictions across the partition boundaries, and hence error propagation. It is shown that such a machine learning model is capable of predicting the LCS SSH evolution 9 weeks in advance, with LCS frontal distance errors within 40 km. Furthermore, it is shown that the model predicted the timing and general location of eddy Darwin's shedding event 12 weeks in advance, and eddy Cameron's detachment and reattachment 8 weeks in advance.
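The partition-then-smooth step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the square tile size, the function names (`partition`, `reassemble`, `smooth_seams`) and the choice of a 3x3 box average applied only next to tile boundaries are all assumptions made for illustration, since the abstract does not specify the smoothing function.

```python
import numpy as np


def partition(field, tile):
    """Split a 2-D field into non-overlapping tile x tile blocks (hypothetical helper)."""
    ny, nx = field.shape
    if ny % tile or nx % tile:
        raise ValueError("grid size must be a multiple of the tile size")
    return {
        (j, i): field[j:j + tile, i:i + tile].copy()
        for j in range(0, ny, tile)
        for i in range(0, nx, tile)
    }


def reassemble(blocks, shape):
    """Stitch per-subregion predictions back into one grid."""
    out = np.empty(shape, dtype=float)
    for (j, i), block in blocks.items():
        h, w = block.shape
        out[j:j + h, i:i + w] = block
    return out


def box3(field):
    """3x3 moving average with edge padding (assumed smoothing kernel)."""
    ny, nx = field.shape
    padded = np.pad(field, 1, mode="edge")
    acc = np.zeros_like(field, dtype=float)
    for dj in range(3):
        for di in range(3):
            acc += padded[dj:dj + ny, di:di + nx]
    return acc / 9.0


def smooth_seams(field, tile):
    """Replace only the cells adjacent to an internal tile boundary with their smoothed value."""
    ny, nx = field.shape
    mask = np.zeros(field.shape, dtype=bool)
    for j in range(tile, ny, tile):
        mask[j - 1:j + 1, :] = True   # rows on both sides of a horizontal seam
    for i in range(tile, nx, tile):
        mask[:, i - 1:i + 1] = True   # columns on both sides of a vertical seam
    out = field.astype(float).copy()
    out[mask] = box3(field)[mask]
    return out
```

In this sketch each block returned by `partition` would be handed to its own predictor, the outputs would be recombined with `reassemble`, and `smooth_seams` would then soften any jump left at the tile edges while leaving the interior of each tile untouched.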
Certain small noncoding microRNAs (miRNAs) are differentially expressed in normal tissues and cancers, which makes them strong candidates as cancer biomarkers. Previously, a selected subset of miRNAs has been experimentally verified to be linked to breast cancer. In this paper, we validated the importance of these miRNAs using a machine learning approach on miRNA expression data. We performed feature selection, using Information Gain (IG), Chi-Squared (CHI2) and Least Absolute Shrinkage and Selection Operator (LASSO), on the set of these relevant miRNAs to rank them by importance. We then performed cancer classification with these miRNAs as features, using Random Forest (RF) and Support Vector Machine (SVM) classifiers. Our results demonstrated that the miRNAs ranked higher by our analysis yielded higher classifier performance. Performance decreased as the rank of the miRNA decreased, confirming that these miRNAs had different degrees of importance as biomarkers. Furthermore, we discovered that using as few as three miRNAs as biomarkers for breast cancer can be as effective as using the entire set of 1800 miRNAs. This work suggests that machine learning is a useful tool for functional studies of miRNAs for cancer detection and diagnosis.
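As a rough illustration of the Information Gain ranking mentioned in this abstract, the sketch below scores each feature by how much a single split reduces label entropy. The median split, the function names, the feature names `mir_a` and `mir_b`, and the toy expression values are all hypothetical; the abstract does not say how the expression values were discretized.

```python
import math
from collections import Counter
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting samples at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Rank features by information gain, splitting each at its median (an assumption)."""
    scores = {
        name: information_gain(values, labels, median(values))
        for name, values in table.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up expression values for two hypothetical miRNAs.
toy_labels = ["tumor", "tumor", "normal", "normal"]
toy_table = {
    "mir_a": [9.1, 8.7, 1.2, 0.9],   # separates the two classes
    "mir_b": [5.0, 1.0, 5.2, 1.1],   # unrelated to the class label
}
toy_ranking = rank_features(toy_table, toy_labels)
```

On this toy input `mir_a` ranks first because its split separates the classes completely, while `mir_b` carries no information about the label. The top-ranked features would then be passed to a classifier such as a Random Forest or SVM.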
Kidney cancer is one of the deadliest diseases, and its diagnosis and subtype classification are crucial for patients’ survival. Thus, developing automated tools that can accurately determine kidney cancer subtypes is an urgent challenge. Researchers in the biomedical field have confirmed that miRNA dysregulation can cause cancer. In this paper, we propose a machine learning approach for the classification of kidney cancer subtypes using miRNA genome data. Through empirical studies we found 35 miRNAs that possess distinct key features that aid in kidney cancer subtype diagnosis. In the proposed method, Neighbourhood Components Analysis (NCA) is employed to extract discriminative features from miRNAs, and Long Short-Term Memory (LSTM), a type of Recurrent Neural Network, is adopted to classify a given miRNA sample into kidney cancer subtypes. In the literature, only a couple of kidney cancer subtypes have been considered for classification. In the experimental study, we used the miRNA quantitative read count data provided by The Cancer Genome Atlas (TCGA) data repository. The NCA procedure selected 35 of the most discriminative miRNAs. With this subset of miRNAs, the LSTM algorithm was able to classify kidney cancer miRNA samples into five subtypes with an average accuracy of about 95% and a Matthews Correlation Coefficient of about 0.92 over 10 runs of randomly grouped 5-fold cross-validation, which is very close to the average performance obtained using all miRNAs for classification.
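The Matthews Correlation Coefficient reported in this abstract extends to more than two classes. The sketch below uses the standard multiclass generalization computed from class counts; the function name and the integer class labels are illustrative assumptions, and nothing here reproduces the paper's data or results.

```python
import math
from collections import Counter


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews Correlation Coefficient from true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                      # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correctly classified samples
    t_k = Counter(y_true)                                # how often each class truly occurs
    p_k = Counter(y_pred)                                # how often each class is predicted
    classes = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in classes)
    spread_true = s * s - sum(v * v for v in t_k.values())
    spread_pred = s * s - sum(v * v for v in p_k.values())
    denominator = math.sqrt(spread_true * spread_pred)
    # By convention, return 0 when the score is undefined (e.g. a single predicted class).
    return numerator / denominator if denominator else 0.0
```

Unlike plain accuracy, this score stays near zero for a classifier that always predicts the most common subtype, which is why it is a useful companion metric when subtype frequencies are unbalanced.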