Previous research indicates that the narrative disclosures in company annual reports can be used to assist in assessing a company's short-term financial prospects. However, little effort has been made to systematically and automatically assess the predictive potential of such reports using text classification, information retrieval, and machine learning techniques. In this study, we built SVM-based predictive models with different feature selection methods from ten years of annual reports of 30 companies. We used feature selection methods to reduce the term space and studied the class-related vocabulary. Predictive accuracy was evaluated with cross-validation and t-tests for statistical significance. We compared the models' performance, analyzed misclassification rates by year and by industry, and identified the strengths and weaknesses of each model. Our results support the feasibility of automatically predicting next-year company financial performance from the current year's report. We suggest that text features be studied further to understand their roles as indicators of a company's future performance. This research paves the way for large-scale automatic analysis of the relationship between annual reports and short-term performance, as well as the identification of interesting signals within annual reports.
Introduction
Company annual reports (10K filings) are freely available to the public and contain required disclosures, quantitative summaries of the company's financial performance, and textual discussions. These reports are of great importance in helping investors, corporate managers, and financial analysts with their decision-making. Studies have shown that the narration sections of 10K filings provide information that is as useful to financial analysts as the financial ratios when predicting the company's future prospects (Roger & Grant, 1997; Schipper, 1991). The SEC (Securities and Exchange Commission) also requires the reporting of the firm's strategies and managerial priorities, and its view of the past year's performance and future prospects. The major mandatory disclosures in annual reports include reasons for price and sales changes, reasons for revenue and cost changes, planned expenditures, known trends, and future liquidity positions.
Annual reports have been studied as a marketing and communication tool that the corporation uses to convey an image or messages to its stakeholders (Herreman & Ryans, 1995). More recent studies on the relationship between the reports and firm performance have focused on special sections of the reports, such as the chairman's statement (Smith & Taffler, 2000), management discussion and analysis (MD&A) (Bryan, 1997), and the president's letter (Abrahamson & Amir, 1996), as well as the general writing style and readability (Subramanian et al., 1993). The methods these studies employ are generally semi-automatic, including content analysis, readability measurements, manual annotation and categorization, linear discriminant analysis, logit models and other statistical...