2014
DOI: 10.1093/annhyg/meu006

Beyond Crosswalks: Reliability of Exposure Assessment Following Automated Coding of Free-Text Job Descriptions for Occupational Epidemiology

Abstract: Epidemiologists typically collect narrative descriptions of occupational histories because these are less prone than self-reported exposures to recall bias of exposure to a specific hazard. However, the task of coding these narratives can be daunting and prohibitively time-consuming in some settings. The aim of this manuscript is to evaluate the performance of a computer algorithm to translate the narrative description of occupational codes into standard classification of jobs (2010 Standard Occupational Class…

Cited by 19 publications (10 citation statements)
References 17 publications
“…Code rates were 96 to 97 percent and matched the manual code 89 percent of the time. Similarly, another study reported on manual versus automated coding of occupation data and assignment of occupational exposure in a large governmental database, finding that automated coding of occupations results in assignment of exposures in reasonable agreement with results from manual coding (Burstyn et al 2014).…”
Section: Introduction (mentioning)
confidence: 77%
“…Even at the current level of accuracy, significant time and cost savings are possible over manual coding by insurers or researchers in Canada. As identified by Burstyn et al [ 9 ], it can take manual coders days to months to code a few 100 occupations to a classification. They reported that it can take 2 months to code approximately 1600 free text descriptors of lifetime occupational histories to the 2010 SOC.…”
Section: Discussion (mentioning)
confidence: 99%
“…In one approach, an initial match between the input data and the text description of a top-level class triggers further matchmaking at lower levels in the hierarchy. Another method is to directly compare the input data with each of the hierarchical levels [ 9 ]. In the same study, Burstyn et al [ 9 ] applied an algorithm that mixes both methods to code against the SOC (2010) system.…”
Section: Introduction (mentioning)
confidence: 99%
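The two matching strategies described in the statement above can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the toy taxonomy, the `Node` class, the token-overlap score, and the function names are hypothetical and are not the algorithm of Burstyn et al, which the statement says mixes both methods.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One class in a toy occupational hierarchy (hypothetical structure)."""
    code: str
    title: str
    children: list = field(default_factory=list)


def overlap(a: str, b: str) -> float:
    """Jaccard overlap of lower-cased word sets (an assumed similarity score)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match_top_down(text, nodes):
    """Approach 1: a match at one level triggers matching among its children."""
    best = None
    while nodes:
        candidate = max(nodes, key=lambda n: overlap(text, n.title))
        if overlap(text, candidate.title) == 0:
            break  # nothing at this level matches; keep the last match, if any
        best = candidate
        nodes = candidate.children
    return best.code if best else None


def match_flat(text, nodes):
    """Approach 2: compare the input directly with every level of the hierarchy."""
    everything, stack = [], list(nodes)
    while stack:
        node = stack.pop()
        everything.append(node)
        stack.extend(node.children)
    candidate = max(everything, key=lambda n: overlap(text, n.title))
    return candidate.code if overlap(text, candidate.title) > 0 else None


# Toy taxonomy with simplified titles, for illustration only.
TAXONOMY = [
    Node("29-0000", "healthcare practitioners", [
        Node("29-1141", "registered nurses"),
        Node("29-1060", "physicians and surgeons"),
    ]),
    Node("47-0000", "construction and extraction", [
        Node("47-2031", "carpenters"),
        Node("47-2061", "construction laborers"),
    ]),
]
```

In this toy setting, `match_top_down("registered nurses", TAXONOMY)` finds nothing because no top-level title shares a word with the input, while `match_flat` reaches the lower-level class directly. That trade-off is one plausible reason to combine the two approaches.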
“…Unstructured text fields of job title and industry were matched using a computer algorithm to 2010 Standard Occupational Classification (SOC) codes, a system used by Federal agencies to classify workers into occupational categories, at the highest level of detail available [38]. Precision of the match (percent agreement) was evaluated using a comparison to a subset of manually coded occupations and was found to be adequate, ranging in agreement from 0.72 (or “moderate”) for “broad occupation” groups (the first 4–5 digits in SOC codes) to 0.85 (“very good”) for “minor groups” (the first 3 digits in SOC codes) [39]. Standard codes for job titles were linked to characteristics in the Occupational Information Network (O*NET), a database of occupational descriptions widely used to identify characteristics and demands of jobs [40].…”
Section: Methods (mentioning)
confidence: 99%
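Agreement at different levels of SOC detail, as described in the statement above, can be sketched as a comparison of code prefixes. This is a minimal illustration under stated assumptions: the function names and the sample code lists are hypothetical, and the score is plain proportion agreement, which may differ from the statistic used in the cited studies.

```python
def soc_prefix(code: str, n_digits: int) -> str:
    """Return the first n_digits digits of a SOC code, ignoring the hyphen."""
    digits = "".join(ch for ch in code if ch.isdigit())
    return digits[:n_digits]


def prefix_agreement(manual, automated, n_digits):
    """Proportion of paired codes that agree on their first n_digits digits."""
    if len(manual) != len(automated):
        raise ValueError("manual and automated code lists must be the same length")
    if not manual:
        raise ValueError("at least one pair of codes is required")
    matches = sum(
        soc_prefix(m, n_digits) == soc_prefix(a, n_digits)
        for m, a in zip(manual, automated)
    )
    return matches / len(manual)


# Made-up pairs of manual and automated codes, for illustration only.
manual_codes = ["11-1011", "29-1141", "47-2031", "53-3032"]
automated_codes = ["11-1021", "29-1141", "47-2061", "53-7062"]

minor_group = prefix_agreement(manual_codes, automated_codes, 3)  # first 3 digits
broad_group = prefix_agreement(manual_codes, automated_codes, 5)  # first 5 digits
```

Agreement usually falls as more digits are compared, because a pair must match on every coarser level before it can match on a finer one.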