2020
DOI: 10.1016/j.jedc.2020.103895

Separating the signal from the noise – Financial machine learning for Twitter

Abstract: Most statistical arbitrage strategies in the academic literature solely rely on price time series. By contrast, alternative data sources are of growing importance for professional investors. We contribute to bridging this gap by assessing the price-predictive value of more than nine million tweets on intraday returns of the S&P 500 constituents. For this purpose, we design a machine learning pipeline addressing specific challenges inherent to this task. At first, we engineer domain-specific features along three…

Cited by 22 publications (10 citation statements)
References 49 publications

“…(3) Random forests are not prone to overfitting (Breiman, 2001) and are fairly robust to noise when compared to boosting techniques (Khoshgoftaar et al., 2011). (4) For the specific task addressed in our paper, i.e., stock return prediction, they have empirically been found to have very good predictive performance, and perform similar or better than neural networks (Krauss et al., 2017; Gu et al., 2018; Schnaubelt et al., 2020). (5) Compared to neural networks and popular deep learning architectures, random forests have far less hyperparameters.…”
Section: Random Forest Model
Citation type: mentioning
confidence: 97%
“…Keeping the ease of implementation in mind, we offset positions with an opposing investment in the iShares Core Total U.S. Stock Market ETF (ticker symbol ITOT, BlackRock Inc., 2020). Instead of replicating the size-decile portfolios used to calculate abnormal returns, this approach is common in the statistical arbitrage literature to achieve market-neutrality (see, for example, Avellaneda and Lee, 2010; Schnaubelt et al., 2020).…”
Section: Announcement-based Trading Strategy
Citation type: mentioning
confidence: 99%