quanteda is an R package providing a comprehensive workflow and toolkit for natural language processing tasks such as corpus management, tokenization, analysis, and visualization. It has extensive functions for applying dictionary analysis, exploring texts using keywords-in-context, computing document and feature similarities, and discovering multi-word expressions through collocation scoring. Based entirely on sparse operations, it provides highly efficient methods for compiling document-feature matrices and for manipulating these or using them in further quantitative analysis. Using C++ and multithreading extensively, quanteda is also considerably faster and more efficient than other R and Python packages in processing large textual data. quanteda: An R package for the quantitative analysis of textual data.
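To make the idea of a sparse document-feature matrix concrete, here is a minimal sketch in plain Python. It is not quanteda's API or implementation (quanteda is an R package backed by C++); the function names `tokenize`, `build_dfm`, and `cosine` and the two sample texts are hypothetical, chosen only for illustration. It shows how storing only nonzero counts per document is enough to compute a document similarity.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on runs of non-letters (a deliberately crude tokenizer)."""
    return [tok for tok in re.split(r"[^a-z]+", text.lower()) if tok]


def build_dfm(docs):
    """Build a sparse document-feature matrix.

    Returns (features, rows): features[j] is the feature with column index j,
    and rows[i] maps column index -> count for document i. Zero counts are
    never stored, which is what makes the representation sparse.
    """
    vocab = {}  # feature -> column index, assigned in order of first appearance
    rows = []
    for text in docs:
        row = {}
        for tok, n in Counter(tokenize(text)).items():
            row[vocab.setdefault(tok, len(vocab))] = n
        rows.append(row)
    features = sorted(vocab, key=vocab.get)
    return features, rows


def cosine(a, b):
    """Cosine similarity of two sparse rows, touching only stored entries."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample texts, not taken from any real corpus.
docs = ["The cat sat.", "The cat ran, the dog sat."]
features, rows = build_dfm(docs)
similarity = cosine(rows[0], rows[1])
```

The dictionary-per-row layout is one simple choice among several; production libraries typically use compressed column or row formats instead, which this sketch does not attempt to reproduce.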
A 180-km-long, high-resolution seismic-reflection survey that imaged the entire crust and the uppermost mantle lithosphere was conducted across the northeastern Tibetan Plateau. This work had three aims: (1) to examine whether the left-slip Haiyuan and Tianjing faults defining the margin of NE Tibet are crustal- or lithospheric-scale structures, (2) to determine whether seismic fabrics are consistent with middle- and/or lower-crustal channel flow, and (3) to establish the minimum amount of Cenozoic shortening strain in the region. Analysis of our newly obtained seismic-reflection data suggests that the left-slip Haiyuan and Tianjing faults have multiple strands and cut through the upper and middle crust. The faults likely terminate at a low-angle detachment shear zone in the lower crust, because the flat Moho directly below the projected traces of the faults is continuous. The seismic image displays subvertical zones of highly reflective sequences containing parallel and subhorizontal reflectors that are truncated by seismically transparent regions with irregular shape. The transparent regions in the middle crust are traceable to the seismically transparent lower crust and are interpreted as early Paleozoic plutons emplaced during the construction of the Qilian arc in the region. The presence of the undisturbed subvertical contacts between zones of highly reflective and seismically transparent regions rules out the occurrence of channel flow in the middle crust, as this process would require through-going subhorizontal reflectors bounding the channel above and below. The lack of continuous reflectors longer than a few kilometers in the lower crust makes a laminar mode of channel flow unfavorable, but lateral lower-crustal flow could have occurred via small-scale ductile deformation involving folding (less than a few kilometers in wavelength and amplitude).
Integrating surface geology and the seismic data, we find that the upper crust along a segment of the seismic surveying line experienced up to 46% crustal shortening postdating the Cretaceous and is thus interpreted as entirely accumulated in the Cenozoic. If the estimated shortening strain is representative across northeastern Tibet, its magnitude is sufficient to explain the current elevation of the region without an appeal for additional contributing factors such as channel flow and/or a thermal event in the upper mantle.
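As a rough check on the scale of that shortening, the following back-of-the-envelope sketch may help; it is not the authors' calculation. It assumes plane-strain deformation with conservation of cross-sectional area, and it assumes hypothetically that the 46% strain measured in the upper crust applies to the whole crustal column. The symbols $L_0$, $L$, $h_0$, and $h$ are introduced here for the original and final section length and the original and final crustal thickness.

```latex
e = \frac{L_0 - L}{L_0} = 0.46,
\qquad
L_0 h_0 = L h
\;\Longrightarrow\;
\frac{h}{h_0} = \frac{L_0}{L} = \frac{1}{1 - e} = \frac{1}{0.54} \approx 1.85
```

Under these assumptions the crust would thicken by a factor of roughly 1.85. Converting that thickening into surface elevation would additionally require an isostatic model and an initial crustal thickness, neither of which is stated here.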