The click-through rate (CTR) prediction task estimates the probability that a user will click on a recommended item, which is extremely important in recommender systems. The recently proposed deep factorization machine (DeepFM) algorithm incorporates a factorization machine (FM) to learn not only low-order features but also higher-order feature interactions. However, DeepFM lacks diverse user representations and does not consider textual features. In view of this, we propose a text-attention FM (TAFM) based on the DeepFM algorithm. First, the attention mechanism in TAFM addresses the diverse representations of users and items and mines the features that interest users most. Second, TAFM fully learns text features through its text component, text attention component, and N-gram text feature extraction component, which together explore potential user preferences and the diversity of user interests. In addition, the convolutional autoencoder in TAFM learns higher-level features, making the higher-order feature mining process more comprehensive. On the public dataset, the best-performing existing models are the deep cross network (DCN), DeepFM, and the product-based neural network (PNN), whose AUC scores hover between 0.698 and 0.699. The AUC score of our proposed model is 0.730, at least 3% higher than that of the existing models, and its accuracy is at least 0.1 percentage points higher.
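As a rough illustration of one ingredient the abstract attributes to TAFM, the sketch below applies an attention network to the pairwise (second-order) interactions of field embeddings, so that the interactions a given user finds most interesting are weighted more heavily. This is not the authors' code: the embedding size, attention network, and scoring head are assumptions, and the text and N-gram components are omitted.

```python
# Hypothetical sketch: attention over second-order FM interactions (PyTorch).
import torch
import torch.nn as nn

class AttentionalPairwiseFM(nn.Module):
    def __init__(self, n_features, embed_dim=16, attn_dim=32):
        super().__init__()
        self.embedding = nn.Embedding(n_features, embed_dim)
        # Small MLP that scores how important each pairwise interaction is.
        self.attn = nn.Sequential(
            nn.Linear(embed_dim, attn_dim), nn.ReLU(),
            nn.Linear(attn_dim, 1)
        )
        self.out = nn.Linear(embed_dim, 1)

    def forward(self, feature_ids):            # feature_ids: (batch, n_fields)
        e = self.embedding(feature_ids)        # (batch, n_fields, embed_dim)
        # Element-wise products of all field pairs (second-order FM terms).
        i, j = torch.triu_indices(e.size(1), e.size(1), offset=1)
        pair = e[:, i, :] * e[:, j, :]         # (batch, n_pairs, embed_dim)
        # Attention weights select the interactions that matter most.
        w = torch.softmax(self.attn(pair), dim=1)
        pooled = (w * pair).sum(dim=1)         # (batch, embed_dim)
        return torch.sigmoid(self.out(pooled)).squeeze(-1)  # CTR estimate
```

In a full TAFM-style model, this attention-pooled interaction vector would be combined with the deep component and the text-derived features before the final prediction; here it stands alone for clarity.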
Premature ventricular contractions (PVCs) are among the most common cardiovascular conditions and pose a high risk to a large population of patients. Supervised learning algorithms have been shown to detect PVCs from beat-level ECG data; however, substantial human annotation effort is needed to achieve an accurate detection rate. In this work, a convolutional autoencoder was trained in an unsupervised fashion to extract features automatically, without any prior specialized knowledge. A random forest was adopted as the supervised classifier trained on the features generated by the autoencoder. Various active learning selection strategies, both uncertainty-based and diversity-based, were studied on top of the random forest. In each active learning iteration, the training data are updated with newly selected samples, the classifier is retrained, and performance on an independent validation set is recorded. Among the uncertainty sampling strategies, least-confidence sampling achieves the best F1 score (0.85). Between the two diversity-based strategies, representative cluster sampling achieves a better F1 score than the k-center-greedy algorithm. When the active learning methods trained on half of the original data are compared with the same classifier trained on the full set, least-confidence sampling still yields a higher F1 score than training on the full set. This study demonstrates that active learning can reduce human annotation effort while achieving the same level of performance as a classifier trained on the fully annotated training data.
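The following is a minimal sketch of the least-confidence selection loop described above, assuming beat-level features have already been extracted (e.g., by the convolutional autoencoder) and that labels for queried samples can be obtained from an oracle. The function name, hyperparameters, and initialization are illustrative, not taken from the paper.

```python
# Hypothetical sketch: least-confidence active learning with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def least_confidence_active_learning(X_pool, y_pool, X_val, y_val,
                                     n_init=100, n_query=50, n_rounds=10):
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_pool), size=n_init, replace=False))
    unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]

    for _ in range(n_rounds):
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(X_pool[labeled], y_pool[labeled])

        # Least confidence: query samples whose top predicted probability is lowest.
        proba = clf.predict_proba(X_pool[unlabeled])
        confidence = proba.max(axis=1)
        queried = [unlabeled[k] for k in np.argsort(confidence)[:n_query]]

        # "Annotate" the queried samples (labels come from the oracle / dataset).
        labeled.extend(queried)
        unlabeled = [i for i in unlabeled if i not in set(queried)]

        # Record performance on the independent validation set each iteration.
        print("validation F1:", f1_score(y_val, clf.predict(X_val)))
    return clf
```

A diversity-based strategy such as k-center-greedy would replace only the query step, selecting samples that best cover the feature space rather than the least confident ones.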