2019
DOI: 10.3386/w25837

Transforming Naturally Occurring Text Data Into Economic Statistics: The Case of Online Job Vacancy Postings

Abstract: Using a dataset of 15 million UK job adverts from a recruitment website, we construct new economic statistics measuring labour market demand. These data are 'naturally occurring', having originally been posted online by firms. They offer information on two dimensions of vacancies, region and occupation, that firm-based surveys do not usually, and cannot easily, collect. These data do not come with official classification labels so we develop an algorithm which maps the free form text of job descriptions into stan…

Cited by 19 publications (12 citation statements)
References 27 publications
“…The use of such data, in fact, requires the resolution of problems linked to definition and data collection (duplication, selection of stable and credible sources, removing non‐work ads, selection of active ads etc.), the transformation from OJA flows to stocks (Garasto et al, 2021; Turrell et al, 2019) and issues such as non‐representativeness and lack of coverage at the population level (Couper, 2013; Fan et al, 2014; Japec et al, 2015; Kureková et al, 2015; Tam & Clarke, 2015).…”
Section: Discussion (mentioning)
confidence: 99%
“…This suggests to define as post-sampling weight balancing the quarterly stock of vacancies by industry according to online vacancies towards the quarterly stock of vacancies by industry according to the JVS: this produces a set of "post-sampling" weights for each quarter and industry, that can be assigned to each vacancy or vacancy distributions (by relevant auxiliary variables, such as NUTS, ISCO, NACE, Quarters and possible interactions). Recent works [16,17] adopt such kind of posts-stratification. Over or under-representation (for univariate or two-way or three way interactions) in online vacancies can be easily assessed by the ratio between percentage distributions of online counts and poststratified ones: If the ratio is higher than 1, it means that a certain category (industry, occupation) is likely to be over-represented in the online job adverts dataset, whereas the opposite is true with ratio is less than 1.…”
Section: Discussion (mentioning)
confidence: 99%
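
The post-stratification described in the statement above can be sketched in a few lines. This is only an illustration under stated assumptions, not the cited authors' implementation: the function names, the record layout (quarter, industry, occupation, count) and all figures are hypothetical. Each quarter-by-industry cell receives the weight survey stock / online stock, and the over- or under-representation of a category is the ratio of its raw online share to its post-stratified share.

```python
from collections import defaultdict


def poststrat_weights(online, survey_stock):
    """Return one weight per (quarter, industry) cell: survey stock / online stock."""
    online_stock = defaultdict(float)
    for quarter, industry, _occupation, count in online:
        online_stock[(quarter, industry)] += count
    # Cells missing from the survey, or empty online, get no weight.
    return {
        cell: survey_stock[cell] / total
        for cell, total in online_stock.items()
        if cell in survey_stock and total > 0
    }


def representation_ratios(online, weights):
    """Ratio of raw online share to post-stratified share, per occupation."""
    raw = defaultdict(float)
    adjusted = defaultdict(float)
    for quarter, industry, occupation, count in online:
        weight = weights.get((quarter, industry))
        if weight is None:
            continue  # skip cells that could not be weighted
        raw[occupation] += count
        adjusted[occupation] += count * weight
    raw_total = sum(raw.values())
    adjusted_total = sum(adjusted.values())
    return {
        occ: (raw[occ] / raw_total) / (adjusted[occ] / adjusted_total)
        for occ in raw
        if adjusted[occ] > 0
    }


# Hypothetical toy data: (quarter, industry, occupation, online vacancy count).
online = [
    ("2019Q1", "IT", "developer", 80),
    ("2019Q1", "IT", "manager", 20),
    ("2019Q1", "Care", "nurse", 50),
    ("2019Q1", "Care", "manager", 50),
]
# Hypothetical survey (JVS-style) vacancy stocks by quarter and industry.
survey_stock = {("2019Q1", "IT"): 50, ("2019Q1", "Care"): 200}

weights = poststrat_weights(online, survey_stock)
ratios = representation_ratios(online, weights)
for occupation, ratio in sorted(ratios.items()):
    label = "over" if ratio > 1 else "under"
    print(f"{occupation}: ratio {ratio:.2f} ({label}-represented online)")
```

In this toy example the IT industry is over-covered online (weight 0.5) and Care is under-covered (weight 2.0), so developers come out with a ratio above 1 and nurses with a ratio below 1. As the statement citing Turrell et al 2019 further down notes, weights of this kind correct occupational shares only to the extent that the distortion comes from industry coverage.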
“…As a consequence, using online job vacancies as a data source for calculating common labour market indicators (such as number of vacancies, labour market tightness, degree of skill mismatch) has become a relatively common practice for scholars and practitioners (Japec and Lyberg 2020;Štefánik, Lyócsa, and Bilka [2022]; Turrell et al 2019). Interesting micro-level applications of online labour market data, beyond skills analysis, include those focusing on the value of the migration experience in employers' demands (Kureková and Žilinčíková, 2018); the role of occupational mismatch in explaining the productivity puzzle (Turrell et al 2021); the relationship between firm credit crunch and employee job search behaviour (Gortmaker, Jeffers, and Lee 2021); discrimination against women in the labour market (Kuhn and Shen 2013); or links between the introduction of unemployment benefits, job searches and job postings during the Great Recession in the US (Marinescu 2017).…”
Section: Online Data In Labour Market Research: Trends and Characteri... (mentioning)
confidence: 99%
“…For example, it is generally rarely the case that the representative dataset would contain information about sector and occupation, coded in a way that allows direct comparison with the online data. It follows that a reweighting strategy based on sectoral representation will align the online data closely with the representative source, in terms of the representation of individual sectors; but this will only address the difference in the studied occupations, to the extent that this difference is caused by different sectoral coverage (Turrell et al 2019). In the previous parts of this paper we emphasized other problems with applying weights to online job vacancy data: mainly, the essential difference between measuring stocks/matched jobs versus flows/unmatched jobs, in (representative) employment statistics and in online vacancy data, respectively.…”
Section: Statistical Techniques (mentioning)
confidence: 99%