Automatic information extraction from identity documents is a fundamental task in digital processes such as onboarding, product applications, and identity validation. The information extraction process consists of identifying, locating, classifying, and recognizing the text of the key fields that an identity document contains; typical key fields are names, last names, document number, and dates. The information extraction problem has traditionally been solved with rule-based algorithms and classic OCR engines. In recent years, implementations based on machine learning models have used NLP (natural language processing) and CV (computer vision) to solve the problem in a more flexible and efficient way (Subramani et al., 2020). This work proposes to solve the information extraction problem with an object detection approach: an object detection model based on transformers (Carion et al., 2020) was implemented, trained, and evaluated, achieving above 95% accuracy in detecting key fields on identity documents.
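As an illustration of that approach, the sketch below adapts a DETR-style detector so that each predicted box corresponds to one key field; the field labels, checkpoint, and file names are our assumptions, not artifacts released with the paper.

```python
# Hedged sketch: key-field detection on an ID document with a DETR-style
# model (Carion et al., 2020). Field labels, checkpoint, and image path
# are illustrative assumptions.
import torch
from PIL import Image
from transformers import DetrImageProcessor, DetrForObjectDetection

FIELD_LABELS = ["name", "last_name", "document_number", "date"]  # assumed classes

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained(
    "facebook/detr-resnet-50",
    num_labels=len(FIELD_LABELS),   # swap the COCO head for field classes
    ignore_mismatched_sizes=True,   # re-initialize the classification head
)
# ... fine-tuning on annotated identity documents would happen here ...

image = Image.open("id_card.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Keep boxes above a confidence threshold; each box localizes one key field,
# whose text can then be read by an OCR engine.
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=[image.size[::-1]]
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(FIELD_LABELS[label], round(float(score), 2), [round(v) for v in box.tolist()])
```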
We generalize the deep self-attention distillation of MINILM (Wang et al., 2020) by using only self-attention relation distillation for task-agnostic compression of pretrained Transformers. In particular, we define multi-head self-attention relations as the scaled dot-products between pairs of query, key, and value vectors within each self-attention module, and employ this relational knowledge to train the student model. Besides its simplicity and unified principle, the approach places no restriction on the number of the student's attention heads, whereas most previous work has to guarantee the same head number between teacher and student. Moreover, the fine-grained self-attention relations tend to fully exploit the interaction knowledge learned by the Transformer. In addition, we thoroughly examine the layer selection strategy for teacher models, rather than just relying on the last layer as in MINILM. We conduct extensive experiments on compressing both monolingual and multilingual pre-trained models. Experimental results demonstrate that our models distilled from base-size and large-size teachers (BERT, RoBERTa, and XLM-R) outperform the state of the art. Distilled models and code will be publicly available at https://aka.ms/minilm.
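A minimal PyTorch sketch of that relation-distillation objective: splitting Q, K, and V into a shared number of relation heads (so teacher and student may have different attention-head counts) and the KL objective follow the text above, while tensor shapes and names are our assumptions.

```python
# Hedged sketch of multi-head self-attention relation distillation.
# teacher_qkv / student_qkv are the (query, key, value) tensors taken from
# one self-attention module of each model, shape [batch, seq_len, hidden].
import torch
import torch.nn.functional as F

def relation_logits(x: torch.Tensor, num_rel_heads: int) -> torch.Tensor:
    """Scaled dot-product between all pairs of vectors of one type (Q, K, or V)."""
    b, s, h = x.shape
    d = h // num_rel_heads                               # relation-head size
    x = x.view(b, s, num_rel_heads, d).transpose(1, 2)   # [b, heads, s, d]
    return torch.matmul(x, x.transpose(-1, -2)) / d ** 0.5

def relation_distill_loss(teacher_qkv, student_qkv, num_rel_heads: int = 12):
    """KL divergence between teacher and student Q-Q, K-K, and V-V relations.

    Because Q, K, V are re-split into num_rel_heads relation heads, the
    teacher and student need not share the same number of attention heads.
    """
    loss = 0.0
    for t, s in zip(teacher_qkv, student_qkv):   # iterate over (Q, K, V)
        t_rel = F.softmax(relation_logits(t, num_rel_heads), dim=-1)
        s_logp = F.log_softmax(relation_logits(s, num_rel_heads), dim=-1)
        loss = loss + F.kl_div(s_logp, t_rel, reduction="batchmean")
    return loss
```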
Sentence scoring and sentence selection are the two main steps in extractive document summarization systems. However, previous works treat them as two separate subtasks. In this paper, we present a novel end-to-end neural network framework for extractive document summarization that jointly learns to score and select sentences. It first reads the document sentences with a hierarchical encoder to obtain sentence representations, then builds the output summary by extracting sentences one by one. Different from previous methods, our approach integrates the selection strategy into the scoring model, which directly predicts the relative importance of each sentence given the previously selected ones. Experiments on the CNN/Daily Mail dataset show that the proposed framework significantly outperforms state-of-the-art extractive summarization models.
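The sketch below shows one plausible realization of joint scoring and selection, assuming sentence embeddings already produced by a hierarchical encoder; the GRU state that summarizes previously selected sentences and the bilinear scorer are our assumptions, not the paper's exact architecture.

```python
# Hedged sketch: score each remaining sentence conditioned on the
# sentences selected so far, and extract the highest-scoring one per step.
import torch
import torch.nn as nn

class ScoreAndSelect(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.state = nn.GRUCell(dim, dim)       # summarizes sentences picked so far
        self.scorer = nn.Bilinear(dim, dim, 1)  # scores a sentence given that state

    def forward(self, sent_embs: torch.Tensor, num_steps: int = 3):
        # sent_embs: [num_sentences, dim] from a hierarchical encoder (assumed given)
        n = sent_embs.size(0)
        h = sent_embs.mean(dim=0, keepdim=True)             # [1, dim] initial state
        picked, mask = [], torch.zeros(n, dtype=torch.bool)
        for _ in range(num_steps):
            scores = self.scorer(sent_embs, h.expand(n, -1)).squeeze(-1)
            scores = scores.masked_fill(mask, float("-inf"))  # no re-selection
            idx = int(scores.argmax())
            picked.append(idx)
            mask[idx] = True
            h = self.state(sent_embs[idx : idx + 1], h)     # fold the pick into state
        return picked
```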
Automatically generating product reviews is a meaningful yet not well-studied task in sentiment analysis. Traditional natural language generation methods rely extensively on hand-crafted rules and predefined templates. This paper presents an attention-enhanced attribute-to-sequence model that generates product reviews for given attribute information, such as user, product, and rating. The attribute encoder learns to represent the input attributes as vectors, and the sequence decoder generates reviews by conditioning its output on these vectors. We also introduce an attention mechanism to jointly generate reviews and align words with input attributes. The proposed model is trained end-to-end to maximize the likelihood of target product reviews given the attributes. We build a publicly available dataset for the review generation task by leveraging Amazon book reviews and their metadata. Experiments on this dataset show that our approach outperforms baseline methods and that the attention mechanism significantly improves the performance of our model.
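A compact sketch of an attribute-to-sequence generator with attention over the attribute vectors; the embedding sizes, single-layer GRU decoder, and additive combination of decoder state and attended context are illustrative assumptions.

```python
# Hedged sketch: encode (user, product, rating) attributes as vectors and
# decode a review conditioned on them, attending over the attribute vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attr2Seq(nn.Module):
    def __init__(self, n_users, n_products, n_ratings, vocab_size, dim=256):
        super().__init__()
        self.attr_embs = nn.ModuleList([
            nn.Embedding(n_users, dim),
            nn.Embedding(n_products, dim),
            nn.Embedding(n_ratings, dim),
        ])
        self.word_emb = nn.Embedding(vocab_size, dim)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, attrs, prev_words):
        # attrs: [batch, 3] attribute ids; prev_words: [batch, seq] shifted tokens
        a = torch.stack(
            [emb(attrs[:, i]) for i, emb in enumerate(self.attr_embs)], dim=1
        )                                                      # [batch, 3, dim]
        h0 = a.mean(dim=1).unsqueeze(0)                        # init decoder state
        out, _ = self.decoder(self.word_emb(prev_words), h0)   # [batch, seq, dim]
        # Attention: align each decoding step with the attribute vectors.
        attn = F.softmax(torch.bmm(out, a.transpose(1, 2)), dim=-1)  # [b, seq, 3]
        ctx = torch.bmm(attn, a)                    # attended attribute context
        return self.out(torch.tanh(out + ctx))     # vocabulary logits

# Training maximizes the likelihood of the target review, e.g. with
# F.cross_entropy(logits.flatten(0, 1), targets.flatten()).
```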
Conventional customer service chatbots are usually based on human dialogue, yet face significant issues in terms of data scale and privacy. In this paper, we present SuperAgent, a customer service chatbot that leverages large-scale and publicly available e-commerce data. Distinct from existing counterparts, SuperAgent takes advantage of data from in-page product descriptions as well as user-generated content from e-commerce websites, which is more practical and cost-effective when answering repetitive questions, freeing up human support staff to answer much higher-value questions. We demonstrate SuperAgent as an add-on extension to mainstream web browsers and show its usefulness to users' online shopping experience.