2019
DOI: 10.1007/s11192-019-03246-1

Analysis of the effect of data properties in automated patent classification

Abstract: Patent classification is a task performed in patent offices around the world by experts, where they assign category codes to a patent application based on its technical content. Nowadays, the number of applications is constantly growing and there is an economical interest on developing accurate and fast models to automate the classification task. In this paper, we present a methodology to systematically analyze the effect of three patent data properties and two classification details on the patent classificati…

Citations: Cited by 10 publications (3 citation statements)
References: References 44 publications
“…Semantic similarity is useful in various NLP tasks such as information retrieval, machine translation, question answering, and entity resolution (Ebraheem et al, 2018; Li et al, 2019; Varelas et al, 2005; Yih et al, 2014; Zou et al, 2013). In practice, it has a wide range of real‐life applications such as patent class prediction, scientific text comparison, newspaper article similarity analysis, topic discovery for United Nations speeches, and concept derivation for encyclopedias (Geum & Kim, 2020; Gomez, 2019; Gong et al, 2019; Shalaby & Zadrozny, 2017; Watanabe & Zhou, 2020). Going beyond determining the semantic similarity as a binary decision (1 or 0, as similar or not), generating non‐binary scores of text pairs followed by ranking has become a common goal of semantic similarity analysis.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Data analysis tools, such as text classification models, can be used to put the data source selected into proper layers of data-driven TRM. Classification models such as support vector machine (SVM) [86], k-nearest neighbor (KNN) [87,88], Hidden Markov [89], and Bayesian [44,90] can be employed.…”
Section: Bidirectional Encoder Representations For Transformers With ... (mentioning)
confidence: 99%
“…Some focused on the best way to represent the patent text and how to extract semantic features from it (D'hondt et al, 2013; Shalaby et al, 2018; Hu et al, 2018a; Hu et al, 2018b; Li et al, 2018) while others focused on designing more effective classification algorithms (Fall et al, 2003; Al Shamsi & Aung, 2016; D'hondt et al, 2017; Wu et al, 2010, 2016; Song et al, 2019). Furthermore, some attempts have been made to find which part of the patent text can be more representative and provide better classification results (Gomez, 2019; Hu et al, 2018a; Wu et al, 2010; D'hondt & Verberne, 2010). Gomez & Moens (2014) did a comprehensive survey of several previous works that tackled the automated patent classification problem in the IPC hierarchy.…”
Section: Related Work (mentioning)
confidence: 99%