Document and discourse segmentation are two fundamental NLP tasks pertaining to breaking up text into constituents, which are commonly used to help downstream tasks such as information retrieval or text summarization. In this work, we propose three transformer-based architectures and provide comprehensive comparisons with previously proposed approaches on three standard datasets. We establish a new state-of-the-art, reducing in particular the error rates by a large margin in all cases. We further analyze model sizes and find that we can build models with many fewer parameters while keeping good performance, thus facilitating real-world applications.
Early life and marriage:Franklin Delano Roosevelt was born on January 30, 1882, in the Hudson Valley town of Hyde Park, New York, to businessman James Roosevelt I and his second wife, Sara Ann Delano. (...) Aides began to refer to her at the time as "the president's girlfriend", and gossip linking the two romantically appeared in the newspapers.(...) Legacy: Roosevelt is widely considered to be one of the most important figures in the history of the United States, as well as one of the most influential figures of the 20th century. (...) Roosevelt has also appeared on several U.S. Postage stamps.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.