This paper presents the concept and uses of a real-time data set that can be used by economists for testing the robustness of published econometric results, for analyzing policy, and for forecasting. The data set consists of vintages, or snapshots, of the major macroeconomic data available at quarterly intervals in real time. The paper illustrates why such data may matter, explains the construction of the data set, examines the properties of several of the variables in the data set across vintages, examines key empirical papers in macroeconomics and investigates their robustness to different vintages, looks at how policy analysis may be affected by data revisions, and shows how forecasts can be affected by data revisions.
This paper describes the existing research (as of February 2008) on real-time data analysis, divided into five areas: (1) data revisions; (2) forecasting; (3) monetary policy analysis; (4) macroeconomic research; and (5) current analysis of business and financial conditions. In each area, substantial progress has been made in recent years, with researchers gaining insight into the impact of data revisions. In addition, substantial progress has been made in developing better real-time data sets around the world. Still, additional research is needed in key areas, and research to date has uncovered even more fruitful areas worth exploring.
This paper discusses how forecasts are affected by the use of real-time data rather than latest-available data. The key issue is this: In the literature on developing forecasting models, new models are put together based on the results they yield using the data set available to the model's developer. But those are not the data that were available to a forecaster in real time. How much difference does the vintage of the data make for such forecasts? We explore this issue with a variety of exercises designed to answer this question. In particular, we find that the use of real-time data matters for some forecasting issues but not for others. It matters for choosing lag length in a univariate context. Preliminary evidence suggests that the span, or number, of forecast observations used to evaluate models may also be critical: we find that standard measures of forecast accuracy can be vintage-sensitive when constructed on the short spans (five years of quarterly data) of data sometimes used by researchers for forecast evaluation. The differences between using real-time and latest-available data may depend on what is being used as the "actual" or realization, and we explore several alternatives that can be used. Perhaps of most importance, we show that measures of forecast error, such as root-mean-squared error and mean absolute error, can be deceptively lower when using latest-available data rather than real-time data. Thus, for purposes such as modeling expectations or evaluating forecast errors of survey data, the use of latest-available data is questionable; comparisons between the forecasts generated from new models and benchmark forecasts, generated in real time, should be based on real-time data.
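As a minimal sketch of why the choice of "actual" matters, the Python snippet below scores one set of forecasts against two different realizations of the same series. All numbers and the names `forecasts`, `first_release`, and `latest_available` are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own exercises may be set up differently.

```python
import math


def rmse(forecasts, actuals):
    """Root-mean-squared error of forecasts against a chosen set of actuals."""
    errors = [f - a for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(forecasts, actuals):
    """Mean absolute error of forecasts against a chosen set of actuals."""
    errors = [abs(f - a) for f, a in zip(forecasts, actuals)]
    return sum(errors) / len(errors)


# Hypothetical quarterly growth forecasts (percent); made-up values.
forecasts = [2.0, 3.0, 1.0, 2.5]

# Two hypothetical candidates for the "actual": the value first published
# for each quarter, and the value for the same quarter in a later vintage.
first_release = [2.5, 2.0, 1.5, 3.5]
latest_available = [2.2, 2.8, 1.1, 2.7]

for label, actuals in (
    ("first release", first_release),
    ("latest available", latest_available),
):
    print(f"{label}: RMSE={rmse(forecasts, actuals):.3f}, "
          f"MAE={mae(forecasts, actuals):.3f}")
```

The same forecasts receive different accuracy scores depending on which realization is treated as the truth, which is why the definition of the "actual" should be stated alongside any reported forecast-error measure.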