In recent years, Chinese text classification has attracted increasing research attention. However, most existing techniques are designed specifically for English text and may lose effectiveness on this task because of the large differences between Chinese and English. In fact, Chinese characters are a special kind of hieroglyphics, and both characters and radicals are semantically useful but remain unexplored in text classification. To that end, in this paper, we first analyze the motivation for using multiple-granularity features to represent a Chinese text by inspecting the characteristics of radicals, characters and words. To better represent Chinese text and then classify it, we propose a novel Radical-aware Attention-based Four-Granularity (RAFG) model that takes full advantage of Chinese characters, words, character-level radicals and word-level radicals simultaneously. Specifically, RAFG applies a serialized BLSTM structure, which is context-aware and can capture long-range information, to model the character-sharing property of Chinese and the sequence characteristics of texts. Further, we design an attention mechanism that enhances the effect of radicals and thus models the radical-sharing property when integrating the granularities. Finally, we conduct extensive experiments, and the results not only show the superiority of our model but also validate the effectiveness of radicals in Chinese text classification.
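To make the attention step above concrete, the following is a minimal sketch of generic dot-product attention, in which one word-level state queries a set of radical-level states and receives a weighted summary of them. It is not RAFG's actual formulation, which the abstract does not specify: the function names, the toy vectors and the choice of dot-product scoring are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys, values):
    """Generic dot-product attention (an assumed scoring function).

    Each value is weighted by softmax(query . key); the weighted sum is
    returned together with the attention weights.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    fused = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, fused


# Hypothetical toy vectors: one word-level state and three radical-level states.
word_state = [1.0, 0.0]
radical_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights, fused = attend(word_state, radical_states, radical_states)
```

In this toy setup the radical state most similar to the word state receives the largest weight, which is the sense in which attention can emphasize related radicals when granularities are combined.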
Crowdfunding is an emerging mechanism for entrepreneurs or individuals to solicit funding from the public for their creative ideas. However, on these platforms, a large proportion of campaigns (projects) fail to raise enough money from backers by the declared expiration date. Predicting the exact success time of campaigns is therefore an urgent problem, but it has not been well explored because of a series of domain and technical challenges. In this paper, we observe that an implicit factor, the distribution of backing behaviors, helps in estimating the success time of a campaign. Therefore, we present a focused study on two specific tasks, i.e., backing distribution prediction and success time prediction of campaigns. Specifically, we propose a Seq2seq-based model with Multi-facet Priors (SMP), which can integrate heterogeneous features to jointly model the backing distribution and success time. Additionally, to keep the change in backing distributions smooth as the number of backing behaviors increases, we develop a linear evolutionary prior for backing distribution prediction. Furthermore, because of the high failure rate, the success time of most campaigns is unobservable. We model this censoring phenomenon from the survival analysis perspective and also develop a non-increasing prior and a partial prior for success time prediction. Finally, we conduct extensive experiments on a real-world dataset from Indiegogo. Experimental results clearly validate the effectiveness of SMP.
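The censoring idea above can be illustrated with a standard discrete-time survival formulation. This is a minimal sketch rather than SMP itself: the per-period hazard values, the function names and the likelihood form are assumptions for illustration, and the paper's non-increasing and partial priors are not reproduced here. The sketch only shows why a survival curve built from hazards is non-increasing and how a campaign that never succeeded still contributes to the likelihood.

```python
import math


def survival_curve(hazards):
    """S(t) = prod_{k <= t} (1 - h_k).

    Each hazard h_k is the assumed probability that the campaign succeeds in
    period k given it has not succeeded before. Every factor lies in [0, 1],
    so the curve can never increase.
    """
    curve, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve


def log_likelihood(hazards, time_index, observed):
    """Log-likelihood of one campaign under the hazards (0-based periods).

    observed=True : success happened in period `time_index`.
    observed=False: the campaign was censored; all we know is that it had
                    not succeeded through period `time_index`.
    """
    curve = survival_curve(hazards)
    if observed:
        prev = curve[time_index - 1] if time_index > 0 else 1.0
        return math.log(prev * hazards[time_index])
    return math.log(curve[time_index])


# Hypothetical hazards for three periods.
hazards = [0.1, 0.2, 0.3]
curve = survival_curve(hazards)
ll_success = log_likelihood(hazards, 1, observed=True)
ll_censored = log_likelihood(hazards, 2, observed=False)
```

A failed campaign is treated as censored at its deadline, so it is scored by the probability of not having succeeded yet instead of being discarded.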
In the area of community question answering (CQA), answer selection and answer ranking are two tasks that help users quickly access valuable answers. Existing solutions mainly exploit the syntactic or semantic correlation between a question and its related answers (Q&A), while the multi-facet domain effects in CQA are still underexplored. In this paper, we propose a unified model, the enhanced attentive recurrent neural network (EARNN), for both answer selection and answer ranking, which takes full advantage of both Q&A semantics and multi-facet domain effects (i.e., topic effects and timeliness). Specifically, we develop a serialized long short-term memory network to learn unified representations of Q&A, where two attention mechanisms, at the sentence level and the word level, are designed to capture the deep effects of topics. Meanwhile, the model can automatically distinguish the emphasis of Q&A. Furthermore, we design a time-sensitive ranking function to model timeliness in CQA. To train EARNN effectively, we also develop a question-dependent pairwise learning strategy. Finally, we conduct extensive experiments on a real-world dataset from Quora. Experimental results validate the effectiveness and interpretability of our proposed EARNN model.
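One plausible reading of a question-dependent pairwise strategy is that training pairs are formed only between answers to the same question. The sketch below shows that idea with a generic margin-based hinge loss. The loss form, the margin, the `(question_id, score, label)` tuple layout and the function name are assumptions for illustration; the abstract does not give EARNN's actual objective or its time-sensitive ranking function, and neither is reproduced here.

```python
from collections import defaultdict


def pairwise_hinge_loss(samples, margin=1.0):
    """Average hinge loss over (good, bad) answer pairs within each question.

    samples: list of (question_id, score, label), where label is 1 for a
    good answer and 0 otherwise. Answers to different questions are never
    compared with each other.
    """
    by_question = defaultdict(list)
    for qid, score, label in samples:
        by_question[qid].append((score, label))

    total, pairs = 0.0, 0
    for answers in by_question.values():
        positives = [s for s, y in answers if y == 1]
        negatives = [s for s, y in answers if y == 0]
        for p in positives:
            for n in negatives:
                # Penalize a pair unless the good answer wins by the margin.
                total += max(0.0, margin - (p - n))
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical model scores: q1 is ranked correctly, q2 is not.
samples = [
    ("q1", 2.0, 1), ("q1", 0.5, 0),
    ("q2", 0.2, 1), ("q2", 0.4, 0),
]
loss = pairwise_hinge_loss(samples)
```

Grouping by question keeps the loss from comparing scores across unrelated questions, whose absolute score ranges need not be comparable.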