Nowadays, a big part of people rely on available content in social media in their decisions (e.g. reviews and feedback on a topic or product). The possibility that anybody can leave a review provide a golden opportunity for spammers to write spam reviews about products and services for different interests. Identifying these spammers and the spam content is a hot topic of research and although a considerable number of studies have been done recently toward this end, but so far the methodologies put forth still barely detect spam reviews, and none of them show the importance of each extracted feature type. In this study, we propose a novel framework, named NetSpam, which utilizes spam features for modeling review datasets as heterogeneous information networks to map spam detection procedure into a classification problem in such networks. Using the importance of spam features help us to obtain better results in terms of different metrics experimented on real-world review datasets from Yelp and Amazon websites. The results show that NetSpam outperforms the existing methods and among four categories of features; including review-behavioral, user-behavioral, reviewlinguistic, user-linguistic, the first type of features performs better than the other categories.
Cardiovascular Disease (CVD) is considered as one of the principal causes of death in the world. Over recent years, this field of study has attracted researchers' attention to investigate heart sounds' patterns for disease diagnostics. In this study, an approach is proposed for normal/abnormal heart sound classification on the Physionet challenge 2016 dataset. For the first time, a fixed length feature vector; called i-vector; is extracted from each heart sound using Mel Frequency Cepstral Coefficient (MFCC) features. Afterwards, Principal Component Analysis (PCA) transform and Variational Autoencoder (VAE) are applied on the ivector to achieve dimension reduction. Eventually, the reduced size vector is fed to Gaussian Mixture Models (GMMs) and Support Vector Machine (SVM) for classification purpose. Experimental results demonstrate the proposed method could achieve a performance improvement of 16% based on Modified Accuracy (MAcc) compared with the baseline system on the Physionet2016 dataset.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.