2021
DOI: 10.1007/s11192-021-04179-4

PatentNet: multi-label classification of patent documents using deep learning based language understanding

Abstract: Patent classification is an expensive and time-consuming task that has conventionally been performed by domain experts. However, the increase in the number of filed patents and the complexity of the documents make the classification task challenging. The text used in patent documents is not always written in a way to efficiently convey knowledge. Moreover, patent classification is a multi-label classification task with a large number of labels, which makes the problem even more complicated. Hence, automating t…

Cited by 34 publications (11 citation statements)
References 39 publications

“…Multiple ML algorithms and NLP techniques are evaluated, with support vector machine (SVM) classifiers achieving over 96% accuracy in multi-class resume classification. Aroush et al. [25] use deep learning and pre-trained language models to handle the complex and time-consuming task of patent classification. Using datasets such as USPTO-2M and M-patent, their work investigates the fine-tuning of models such as BERT, XLNet, RoBERTa, and ELECTRA to achieve cutting-edge performance on multi-label patent categorization.…”
Section: Related Work
confidence: 99%
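
As a rough illustration of the fine-tuning setup this statement describes, the sketch below configures a multi-label patent classifier with the Hugging Face transformers library; the base checkpoint, label count, threshold, and example text are assumptions for illustration, not values taken from the cited paper.

```python
# Minimal sketch, assuming the `transformers` and `torch` packages are installed.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_LABELS = 544  # hypothetical number of patent subclass labels

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # BCE-with-logits loss during fine-tuning
)

# Encode one (toy) patent abstract and score every label independently.
text = "A method and apparatus for encoding video using adaptive block partitioning."
inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.sigmoid(logits)               # multi-label: sigmoid per label, not softmax
predicted = (probs > 0.5).nonzero()[:, 1]   # indices of labels above a 0.5 threshold
```

The same pattern would apply to the other pre-trained encoders mentioned (XLNet, RoBERTa, ELECTRA) by swapping the checkpoint name.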
“…Language recognition relies heavily on Natural Language Processing (NLP), which is used by Google and by Apple's Siri. The ability of technology to recognize human natural-language textual data and speech-based commands depends on two fundamental elements: natural language understanding (NLU) and natural language generation (NLG) [20].…”
Section: Main Concepts
confidence: 99%
“…Zhang and Yu (2020) use word2vec to form word vectors for patent documents and apply a linear algorithm to the semantic representations of patent phrases. Haghighian Roudsari et al. (2021) apply the pre-trained language model BERT to patent text for the subsequent classification task.…”
Section: Related Work
confidence: 99%
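
A minimal sketch of the word2vec-based document representation mentioned above, assuming the gensim library; the tokenised patent texts are toy data, and averaging word vectors into a document vector is one common choice rather than necessarily the cited authors' exact procedure.

```python
# Minimal sketch, assuming gensim 4.x and NumPy.
import numpy as np
from gensim.models import Word2Vec

# Hypothetical tokenised patent abstracts.
patents = [
    ["semiconductor", "device", "with", "recessed", "gate", "electrode"],
    ["method", "for", "wireless", "transmission", "of", "sensor", "data"],
]

# Train word vectors on the patent corpus.
w2v = Word2Vec(sentences=patents, vector_size=100, window=5, min_count=1, epochs=20)

def doc_vector(tokens, model):
    """Average the word vectors of a document into one fixed-length vector."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

# A linear classifier (e.g. logistic regression or a linear SVM) could then be
# trained on these document vectors for the downstream labelling task.
doc_vecs = np.stack([doc_vector(p, w2v) for p in patents])
print(doc_vecs.shape)  # (2, 100)
```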