Topic models such as latent Dirichlet allocation (LDA) and hierarchical Dirichlet processes (HDP) are simple solutions for discovering topics from a set of unannotated documents. While they are simple and popular, a major shortcoming of LDA and HDP is that they do not organize the topics into a hierarchical structure, which is naturally found in many datasets. We introduce the recursive Chinese restaurant process (rCRP) and a nonparametric topic model with rCRP as a prior for discovering a hierarchical topic structure with unbounded depth and width. Unlike previous models for discovering topic hierarchies, rCRP allows the documents to be generated from a mixture over the entire set of topics in the hierarchy. We apply rCRP to a corpus of New York Times articles, a dataset of MovieLens ratings, and a set of Wikipedia articles and show the discovered topic hierarchies. We compare the predictive power of rCRP with LDA, HDP, and the nested Chinese restaurant process (nCRP) using held-out likelihood to show that rCRP outperforms the others. We suggest two metrics that quantify the characteristics of a topic hierarchy to compare the discovered topic hierarchies of rCRP and nCRP. The results show that rCRP discovers a hierarchy in which the topics become more specialized toward the leaves, and topics in the immediate family exhibit more affinity than topics beyond the immediate family.
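For readers unfamiliar with the prior family named in this abstract, the following is a minimal sketch of the standard (flat) Chinese restaurant process, the building block that nCRP and rCRP extend. It is not the authors' rCRP: the recursive, hierarchical construction is not specified in the abstract and is not reproduced here. The function name `sample_crp` and the parameters `alpha` and `seed` are illustrative assumptions.

```python
import random


def sample_crp(num_customers, alpha, seed=0):
    """Seat customers one by one under the standard Chinese restaurant process.

    Illustrative sketch only (hypothetical helper, not the rCRP from the abstract).
    A customer joins existing table k with probability proportional to its
    current size, or opens a new table with probability proportional to alpha.
    alpha must be positive.
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index of customer i
    for _ in range(num_customers):
        # The last weight corresponds to opening a new table.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # new table
        else:
            table_sizes[table] += 1    # existing table
        assignments.append(table)
    return assignments, table_sizes


if __name__ == "__main__":
    assignments, table_sizes = sample_crp(num_customers=50, alpha=1.0)
    print(len(table_sizes), "tables with sizes", table_sizes)
```

Because the number of tables is not fixed in advance, the process gives a prior over partitions of unbounded size; larger values of `alpha` tend to produce more tables.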
Given the need for stretchable sensors, many studies have been conducted on eutectic gallium-indium, which has superior properties as a conductive ink. However, it has remained a challenge to manufacture sensors in a consistent and reproducible manner because conventional mold-based fabrication still depends highly on manual techniques. To overcome this limitation, direct ink writing was used in this study, focusing on improving the stability of writing by exploring issues related to failure and ensuring the consistency of the microchannel by selecting appropriate process variables, including the syringe material. As a result, multiple sensors produced under the same manufacturing conditions behaved similarly. This fabrication technique improved the accuracy of manufacturing a microchannel, and its behavior was predicted successfully by a simple mathematical model, which was confirmed by nondestructive inspections of the microchannel. In developing a one-piece glove-type sensor without an assembly process, the efficiency of the fabrication technique was also emphasized.
In this study, a soft sensor-based three-dimensional (3-D) finger motion measurement system is proposed. The sensors, made of the soft material Ecoflex, comprise embedded microchannels filled with a conductive liquid metal (EGaIn). The superior elasticity, light weight, and sensitivity of soft sensors allow them to be embedded in environments in which conventional sensors cannot be. Complicated finger joints, such as the carpometacarpal (CMC) joint of the thumb, are modeled to specify the location of the sensors. Algorithms to decouple the signals from soft sensors are proposed to extract the pure flexion, extension, abduction, and adduction joint angles. The performance of the proposed system and algorithms is verified by comparison with a camera-based motion capture system.
In a multilingual society, language not only reflects culture and heritage, but also has implications for social status and the degree of integration in society. Different languages can be a barrier between monolingual communities, and the dynamics of language choice could explain the prosperity or demise of local languages in an international setting. We study this interplay of language and network structure in diverse, multilingual societies, using Twitter. In our analysis, we are particularly interested in the role of bilinguals. Concretely, we attempt to quantify the degree to which these users are the "bridge-builders" between monolingual language groups, while monolingual users cluster together. Also, with the revalidation of English as a lingua franca on Twitter, we reveal that users of the native non-English language have higher influence than English users, and that the language convergence pattern is consistent across the regions. Furthermore, we explore for which topics these users prefer their native language rather than English. To the best of our knowledge, this is the largest sociolinguistic study in a network setting.