We present a new keyword extraction algorithm that applies to a single document without using a corpus. Frequent terms are extracted first; then, for each term, the set of its co-occurrences with the frequent terms (i.e., occurrences in the same sentences) is generated. The co-occurrence distribution indicates the importance of a term in the document as follows: if the probability distribution of co-occurrence between term a and the frequent terms is biased toward a particular subset of frequent terms, then term a is likely to be a keyword. The degree of bias of a distribution is measured by the χ²-measure. Our algorithm shows performance comparable to tfidf without using a corpus.
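The scoring idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes pre-tokenized sentences, takes the `num_frequent` most common terms as the frequent terms, and estimates the expected probability of each frequent term as its share of all co-occurrence counts; the function name `chi2_keywords` and these simplifications are assumptions made for illustration only.

```python
from collections import Counter


def chi2_keywords(sentences, num_frequent=3):
    """Score terms by how biased their co-occurrence with frequent terms is.

    `sentences` is a list of token lists. Returns {term: chi-square score}.
    Simplified sketch: expected probabilities are shares of co-occurrence counts.
    """
    term_freq = Counter(tok for sent in sentences for tok in sent)
    frequent = [term for term, _ in term_freq.most_common(num_frequent)]
    frequent_set = set(frequent)

    # cooc[w][g]: number of sentences containing both term w and frequent term g.
    cooc = {}
    for sent in sentences:
        unique = set(sent)
        for w in unique:
            for g in unique & frequent_set:
                if w != g:
                    cooc.setdefault(w, Counter())[g] += 1

    # Expected probability of each frequent term (assumed estimate).
    totals = Counter()
    for counts in cooc.values():
        totals.update(counts)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    p = {g: totals[g] / grand_total for g in frequent}

    # Chi-square statistic: sum over frequent terms of (observed - expected)^2 / expected.
    scores = {}
    for w, counts in cooc.items():
        n_w = sum(counts.values())
        score = 0.0
        for g in frequent:
            if g == w:
                continue
            expected = n_w * p[g]
            if expected > 0:
                score += (counts[g] - expected) ** 2 / expected
        scores[w] = score
    return scores


# Toy document: "x" appears only with "a", while "z" appears equally with "a" and "b".
toy = [["a", "x"], ["a", "x"], ["b", "y"], ["b", "y"],
       ["a", "b", "z"], ["a", "b", "z"]]
scores = chi2_keywords(toy, num_frequent=2)
```

In this toy input, a term whose co-occurrences concentrate on one frequent term ("x") receives a higher score than a term that co-occurs evenly with all frequent terms ("z"), which is the bias the abstract describes.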
Liver failure is a key determinant influencing the natural history of hepatocellular carcinoma (HCC). In this large multi-centre study we externally validate a novel biomarker of liver functional reserve, the ALBI grade, across all the stages of HCC.
Discourse structures have a central role in several computational tasks, such as
question-answering or dialogue generation. In particular, the framework of the
Rhetorical Structure Theory (RST) offers a sound formalism for hierarchical text
organization. In this article, we present HILDA, an implemented discourse parser based
on RST and Support Vector Machine (SVM) classification. SVM classifiers are trained and
applied to discourse segmentation and relation labeling. By combining labeling with a
greedy bottom-up tree building approach, we are able to create accurate discourse trees
in linear time. Importantly, our parser can parse entire texts, whereas the
publicly available parser SPADE (Soricut and Marcu 2003) is limited to sentence level
analysis. HILDA outperforms other discourse parsers for tree structure construction and
discourse relation labeling. For the discourse parsing task, our system reaches 78.3% of
the performance level of human annotators. Compared to a state-of-the-art rule-based
discourse parser, our system achieves a performance increase of 11.6%.
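As a rough illustration of greedy bottom-up tree building, the following Python sketch repeatedly merges the best-scoring pair of adjacent subtrees until one tree remains. It is not HILDA's code: the `score` callable is a hypothetical stand-in for the trained SVM classifiers, relation labels are omitted, and this naive version rescores every adjacent pair at each step, so it does not reproduce the linear-time behavior reported in the abstract.

```python
def greedy_tree(segments, score):
    """Build a binary tree over `segments` by greedy bottom-up merging.

    `score(left, right)` is a hypothetical function returning how strongly
    two adjacent subtrees should be joined (higher means merge sooner).
    Subtrees are represented as nested 2-tuples; leaves are the segments.
    """
    nodes = list(segments)
    if not nodes:
        raise ValueError("at least one segment is required")
    while len(nodes) > 1:
        # Pick the adjacent pair with the highest score (first one on ties).
        best = max(range(len(nodes) - 1),
                   key=lambda i: score(nodes[i], nodes[i + 1]))
        # Replace the pair with a single merged subtree.
        nodes[best:best + 2] = [(nodes[best], nodes[best + 1])]
    return nodes[0]


# Toy scorer (assumption): prefers joining "b" with "c" before anything else.
def toy_score(left, right):
    return 1.0 if (left, right) == ("b", "c") else 0.0


tree = greedy_tree(["a", "b", "c"], toy_score)
```

With the toy scorer, "b" and "c" are merged first and the result is then attached to "a". Caching the scores of pairs that are unaffected by a merge would avoid most of the rescoring.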
CAR is as useful for predicting the postoperative survival of patients with CRC as previously reported inflammation-based prognostic systems, such as GPS and NLR.