Extractive summarization aims to produce a concise version of a document by extracting information-rich sentences from the original text. The graph-based model is an effective and efficient approach to ranking sentences, since it is simple and easy to use. However, its performance depends heavily on good text representation. In this paper, an integrated graph model (iGraph) for extractive text summarization is proposed. An enhanced embedding model is used to detect the inherent semantic properties at the word level, bigram level and trigram level. Words with part-of-speech (POS) tags, bigrams and trigrams were extracted to train the embedding models. Based on the enhanced embedding vectors, the similarity values between the sentences were calculated from three perspectives. The sentences in the document were treated as vertices and the similarities between them as edges. As a result, three different types of semantic graphs were obtained for every document, with the same nodes and different edges. These three graphs were integrated into one enriched semantic graph in a naive Bayesian fashion. After that, TextRank, a graph-based ranking algorithm, was applied to rank the sentences, and the top-scoring sentences were selected for the summary according to the compression rate. Evaluated on the DUC 2002 and DUC 2004 datasets, our proposed method shows competitive performance compared to the state-of-the-art methods.
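The pipeline in this abstract (three sentence-similarity graphs over the same nodes, merged into one graph, then ranked with TextRank) can be illustrated with a minimal sketch. The abstract does not give the integration formula, so the noisy-OR rule used in `integrate_graphs` is an assumption standing in for "a naive Bayesian fashion". The function names, the damping factor of 0.85, and the fixed summary length `k` (used here in place of a compression rate) are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np


def integrate_graphs(sims):
    """Merge similarity matrices (values in [0, 1]) into one graph.

    ASSUMPTION: a noisy-OR combination, 1 - prod(1 - s_i); the paper's
    exact "naive Bayesian" rule is not stated in the abstract.
    """
    clipped = [1.0 - np.clip(np.asarray(s, dtype=float), 0.0, 1.0) for s in sims]
    combined = 1.0 - np.prod(clipped, axis=0)
    np.fill_diagonal(combined, 0.0)  # no self-loops
    return combined


def textrank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Weighted TextRank by power iteration; scores are normalized to sum to 1."""
    n = weights.shape[0]
    row_sums = weights.sum(axis=1, keepdims=True)
    # Row-normalize edge weights; isolated sentences spread their score uniformly.
    transition = np.where(row_sums > 0, weights / np.where(row_sums > 0, row_sums, 1.0), 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1.0 - damping) / n + damping * transition.T @ scores
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


def select_summary(scores, k):
    """Return the indices of the k best sentences, in document order."""
    top = np.argsort(-scores)[:k]
    return sorted(int(i) for i in top)


# Toy example: sentences 0 and 1 are similar at every level, sentence 2 is unrelated.
word_sim = np.array([[0.0, 0.6, 0.0], [0.6, 0.0, 0.0], [0.0, 0.0, 0.0]])
bigram_sim = np.array([[0.0, 0.3, 0.0], [0.3, 0.0, 0.0], [0.0, 0.0, 0.0]])
trigram_sim = np.array([[0.0, 0.1, 0.0], [0.1, 0.0, 0.0], [0.0, 0.0, 0.0]])

graph = integrate_graphs([word_sim, bigram_sim, trigram_sim])
scores = textrank(graph)
summary = select_summary(scores, k=2)
```

In this toy input the two mutually similar sentences reinforce each other and outrank the isolated one, which is the behavior a graph-based ranker is expected to show.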
We present EcForest, an extractive summarization model built on Enhanced Sentence Embedding and Cascade Forest. Sentence representation is of great significance for many summarization methods. Bag-of-words mostly fails to grasp semantics, and typical embedding models cannot capture more complex semantic features, such as polysemy and the meaning of a phrase, which are usually lost when the word embeddings in a sentence are simply averaged. To this end, we propose the Enhanced Sentence Embedding (ESE) model, which addresses these drawbacks by mapping several valid features to dense vectors. Essentially, the enhanced sentence embedding is a novel model for improving the distributed representation of sentences. Our sentence embedding model is universally applicable and can be adapted to other NLP tasks. Moreover, deep forest is used as the sentence extraction algorithm for its robustness to hyper-parameters and its efficient training compared to deep neural networks. The evaluation of the model variants proposed in this work confirms the validity of the enhanced sentence embedding. Comparisons between EcForest and several baselines on two different datasets demonstrate that the proposed summarization model performs better than, or is highly competitive with, the state of the art.
In this study, we examined ticket pricing and train stop planning for the high-speed railway (HSR), which integrates two key aspects of railway operation and organization. We considered that passenger demand is sensitive to the generalized travel cost (depending on the ticket price and the travel time) and that the train stop plan can affect the travel time and passenger distribution. Then, a mixed-integer non-linear optimization model was proposed for the joint problem of ticket pricing and train stop planning to maximize HSR’s transport revenue and minimize passengers’ travel time. Based on the high similarity between combinatorial optimization problems and the solid annealing principle, we designed a combined simulated annealing (CSA) algorithm to solve practical problems. The results of a numerical example in the real HSR network showed that the proposed method can improve transport revenue by 5.1% and reduce passengers’ travel time loss by 11.15% without increasing transport capacity.
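To make the annealing idea concrete, here is a minimal, generic simulated annealing sketch on a toy stop-planning problem. It is not the paper's combined simulated annealing (CSA) algorithm, which the abstract does not specify. The cost function, the demand figures, the per-stop time penalty, and the cooling schedule are all hypothetical values chosen only to show the accept/reject mechanics.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, t_start=1.0, t_end=1e-3,
                        alpha=0.95, steps_per_temp=50, seed=0):
    """Generic simulated annealing with geometric cooling (minimizes `cost`)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis rule: always accept improvements, sometimes accept worse moves.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= alpha  # cool down
    return best, best_cost


# HYPOTHETICAL toy data: revenue gained by stopping at each station,
# and a fixed travel-time penalty charged for every stop.
STATION_REVENUE = [5.0, 1.0, 4.0, 0.5, 3.0]
STOP_PENALTY = 2.0


def plan_cost(plan):
    """Travel-time penalty minus revenue for a 0/1 stop plan (lower is better)."""
    revenue = sum(r * s for r, s in zip(STATION_REVENUE, plan))
    return STOP_PENALTY * sum(plan) - revenue


def flip_one_stop(plan, rng):
    """Neighbor move: toggle the stop decision at one random station."""
    i = rng.randrange(len(plan))
    return plan[:i] + (1 - plan[i],) + plan[i + 1:]


best_plan, best_cost = simulated_annealing((0, 0, 0, 0, 0), plan_cost, flip_one_stop)
```

In this toy setting a stop is worthwhile only when its revenue exceeds the penalty, so the search should settle on stopping at the first, third, and fifth stations. A joint pricing and stop-planning model would replace `plan_cost` with the revenue and travel-time objectives described in the abstract and extend the neighbor move to cover ticket prices as well.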
Customer service text data is the dialogue text exchanged between users and a customer service provider, and it contains a large amount of user information. Effective use of this content can substantially help the service provider optimize its business plans. Building on the traditional machine reading comprehension model, this paper constructs a model for recognizing user attribute labels in customer service text and proposes a pre-training method based on sentence-level pre-training technology. The method is motivated by the model's poor performance on questions that require analysis of the full text, such as user intent and text sentiment. The paper extracts text summaries with the T5-pegasus model to construct a summary dataset for model pre-training. A text summarization model that includes an ERNIE pre-trained model is then built to train the model's ability to understand the full text and to improve its ability to answer questions that require full-text understanding, such as user intent and sentiment analysis. The pre-trained model is then applied to customer service text label recognition, framed as a machine reading comprehension task. Test results on the dataset show that the improved model performs better on the customer service text label recognition task.
This paper aims to optimize the transportation network and the transportation organization strategy for transit transport through China, so that operators can obtain greater profits, transit freight moves more efficiently, and the problems of transportation pricing and route selection for transit goods are solved. The growth trend of transit transport demand is determined first. On this basis, with maximizing the operator's transport profit as the ultimate goal, transport income and transport cost are analyzed in depth. In addition, by reviewing existing international transportation routes, an overall map of the transit network through China to Central Asian and European countries is drawn. To minimize transportation expenditure, a model for comparing freight routes is established. Customers are also classified using a matrix model. Finally, the model is verified using transit transport from Japan, Korea and other countries as examples, and the optimal transportation path is obtained by solving the model in software. Compared with the current scheme, the resulting plan reduces operating costs.