Due to the expansion of the Internet and the Web 2.0 phenomenon, there is growing interest in sentiment analysis of freely opinionated text. In this paper, we propose a novel cross-source, cross-domain sentiment classification approach, in which cross-domain labeled Web sources (Amazon and Tripadvisor) are used to train supervised learning models (including two deep learning algorithms) that are tested on typically unlabeled social media reviews (Facebook and Twitter). We explored a three-step methodology, in which distinct balanced training, text preprocessing and machine learning methods were tested, using two languages: English and Italian. The best results were achieved using undersampling training and a Convolutional Neural Network. Interesting cross-source classification performance was achieved, in particular when using Amazon and Tripadvisor reviews to train a model that is tested on Facebook data, for both English and Italian.
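The balanced training step mentioned above can be illustrated with a minimal sketch of random undersampling. The abstract does not describe the exact procedure, so the function name `undersample`, the fixed seed and the toy reviews below are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

def undersample(texts, labels, seed=0):
    """Randomly drop examples so every class keeps as many items as the rarest class.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # The size of the smallest class sets the target size for all classes.
    n_min = min(len(items) for items in by_class.values())

    # Sample n_min items per class without replacement, then shuffle.
    balanced = []
    for label, items in by_class.items():
        for text in rng.sample(items, n_min):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return balanced

# Toy data (made up): four positive and two negative reviews.
reviews = ["great", "loved it", "fine", "superb", "awful", "bad"]
labels = ["pos", "pos", "pos", "pos", "neg", "neg"]
balanced = undersample(reviews, labels)
```

After the call, `balanced` holds the same number of examples per class, at the cost of discarding some majority-class reviews.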
User location data is valuable for diverse social media analytics. In this paper, we address the non-trivial task of estimating the worldwide city-level location of a Twitter user considering only historical tweets. We propose a purely unsupervised approach (no location data is used) that is based on a synthetic geographic sampling of Google Trends (GT) city-level frequencies of tweet nouns and three clustering algorithms. The approach was validated empirically by using a recently collected dataset, with 3,268 worldwide city-level locations of Twitter users, obtaining competitive results when compared with a state-of-the-art Word Distribution (WD) user location estimation method. The best overall results were achieved by the GT noun (GTN) DBSCAN (GTN-DB) method, which is computationally fast and correctly predicts the ground truth locations of 15%, 23%, 39% and 58% of the users for tolerance distances of 250 km, 500 km, 1,000 km and 2,000 km, respectively.
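The tolerance-distance evaluation reported above can be sketched as the share of users whose predicted location lies within a given distance of the ground truth. The abstract does not state which distance measure is used, so the great-circle (haversine) distance, the function names and the example coordinates below are assumptions.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def accuracy_within(predicted, truth, tolerance_km):
    """Fraction of users whose prediction is at most tolerance_km from the truth."""
    hits = sum(haversine_km(p, t) <= tolerance_km for p, t in zip(predicted, truth))
    return hits / len(truth)

# Illustrative coordinates only (approximate city centers, not taken from the dataset).
lisbon, porto, paris = (38.72, -9.14), (41.15, -8.61), (48.86, 2.35)
predicted = [lisbon, paris]
truth = [porto, paris]

acc_250 = accuracy_within(predicted, truth, 250)
acc_500 = accuracy_within(predicted, truth, 500)
```

Here the Lisbon prediction for a Porto user misses the 250 km tolerance but falls inside the 500 km one, which is why accuracy grows with the tolerance distance.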
This paper discusses the application of five t-GARCH models to the problem of accurately modeling three univariate but mutually dependent wind speed series taken from three US metering sites located a few kilometers from each other. Besides a benchmark model consisting of three independent univariate t-GARCH models, a t-CCC, a t-DCC, a t-copula/t-CCC and a t-copula/t-DCC model will be estimated, studied in their unconditional (i.e. static) and conditional (i.e. fully dynamic) statistical features, and compared to each other and to some statistical features of the original series. In order to highlight the usefulness of choosing volatility-oriented modeling such as multivariate GARCH modeling for wind speed series, an Energy Finance application of capital budgeting under risk, i.e. energy portfolio selection, will be discussed and applied to the five modeling schemes.
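As a rough illustration of the volatility-oriented modeling referred to above, the following sketch computes the conditional variance recursion of a univariate GARCH(1,1) model. The abstract does not give the model orders or parameter values, so the (1,1) order, the function name and the numbers used are assumptions; the Student-t innovations of a t-GARCH model affect the likelihood used for estimation, not this recursion.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1].

    Hypothetical helper for illustration; the recursion is started at the
    unconditional variance omega / (1 - alpha - beta).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("parameters must satisfy omega > 0, alpha, beta >= 0, alpha + beta < 1")

    sigma2 = [omega / (1 - alpha - beta)]
    # Each new variance depends on the previous residual and the previous variance.
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2

# Made-up residuals and parameters, purely to exercise the recursion.
variances = garch11_variance([0.0, 2.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
```

A large residual at one step raises the variance at the next step, which then decays back toward the unconditional level at a rate governed by `alpha + beta`.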