In this study, we propose an alternative approach to analyzing a domain-specific time series corpus for detecting word evolution. The method trains a target corpus in time series into a temporal word embedding (TWE) model. The advantage of TWE is that one can see how the meaning of a word changes over time. We have chosen the TWEC approach to model a Malay domain-specific time-series corpus, the Malaysian Hansard Corpus (MHC), to a TWE model and called the model as MHC-TWEC. Two primary analyses, i.e., self-similarity analysis and user-defined method analysis, were performed to validate the effectiveness of the MHC-TWEC model in quantifying semantic shift on MHC visually. From those analyses, we visually find out that the TWE model can capture the semantic shift in the temporal corpus (the MHC).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.