Sign language conveys information through multiple channels, such as hand shape, hand movement, and mouthing. Modeling this multichannel information is a highly challenging problem. In this paper, we elucidate the link between spoken language and sign language in terms of production and perception phenomena. Through this link, we show that hidden Markov model-based approaches developed to model "articulatory" features for spoken language processing can be exploited to model the multichannel information inherent in sign language for sign language processing.
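The abstract does not spell out how the channels are combined, but a common way to fuse multiple channels in HMM-based systems is the multistream approach: per-channel emission log-likelihoods are combined with a weighted sum. A minimal sketch of that idea (function and weight names are hypothetical, not from the paper):

```python
def combined_log_likelihood(stream_log_likelihoods, stream_weights):
    """Combine per-channel (e.g., hand shape, hand movement, mouthing)
    emission log-likelihoods with a weighted sum, as in multistream HMMs."""
    assert len(stream_log_likelihoods) == len(stream_weights)
    return sum(w * ll for ll, w in zip(stream_log_likelihoods, stream_weights))

# Example: three channels weighted equally
score = combined_log_likelihood([-1.2, -0.8, -2.0], [1 / 3, 1 / 3, 1 / 3])
```

In a full recognizer this combined score would replace the single-stream emission score inside the Viterbi decoder; the stream weights let one channel (e.g., hand movement) dominate when another is unreliable.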
Languages convey information through a set of symbols or units. Typically, this unit is the word. When developing language technologies, words in a language do not all have the same prior probability, so there may not be sufficient training data to model each word. Furthermore, the training data may not cover all possible words in the language. Due to these data sparsity and word coverage issues, language technologies instead model subword units, or subunits, which are based on prior linguistic knowledge. For instance, the development of speech technologies such as automatic speech recognition systems presumes that there exists a phonetic dictionary, or at least a writing system, for the target language. Such knowledge is not available for all the languages of the world. In that direction, this article develops a hidden Markov model-based abstract methodology to extract subword units given only pairwise comparisons between utterances (i.e., realizations of words in the mode of communication): whether two utterances correspond to the same word or not. We validate the proposed methodology through investigations on spoken language and sign language. In the case of spoken language, we demonstrate that the methodology can lead to the discovery of a phone set and the development of a phonetic dictionary. In the case of sign language, we demonstrate how hand movement information can be effectively modeled for sign language processing and synthesized back to gain insight into the derived subunits.
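The methodology's only supervision is pairwise same/different-word judgments. Before any HMM training, such judgments already determine a partition of the utterances into word classes. A minimal sketch of that first step, using union-find (the actual paper trains HMMs on top of this; the function names here are illustrative, not from the paper):

```python
def group_utterances(n, same_pairs):
    """Group n utterances into word classes using only pairwise
    'same word' judgments, via union-find with path halving."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    classes = {}
    for i in range(n):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())

# Utterances 0, 1, 3 are realizations of one word; 2 and 4 of another
print(group_utterances(5, [(0, 1), (1, 3), (2, 4)]))  # → [[0, 1, 3], [2, 4]]
```

Each resulting class can then be modeled by an HMM, and subword units emerge from the structure shared across those class models.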
Sign language recognition involves modeling multichannel information such as hand shapes and hand movements. It also requires sufficient sign language-specific data, which is a challenge, as sign languages are inherently under-resourced. In the literature, it has been shown that hand shape information can be estimated by pooling resources from multiple sign languages. No such capability yet exists for modeling hand movement information. In this paper, we develop a multilingual sign language approach in which hand movement modeling is also done with target sign language-independent data through the derivation of hand movement subunits. We validate the proposed approach through an investigation on Swiss German Sign Language, German Sign Language, and Turkish Sign Language, and demonstrate that sign language recognition systems can be effectively developed using multilingual sign language resources.
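The gain from a shared subunit inventory is that sign-level models for a new language can be assembled without target-language movement data. A minimal sketch of that composition step, with an entirely hypothetical inventory and lexicon (the strings stand in for trained subunit HMMs):

```python
def compose_sign_model(sign, lexicon, subunit_models):
    """Build a sign-level model by concatenating hand-movement subunit
    models drawn from an inventory pooled across sign languages."""
    return [subunit_models[unit] for unit in lexicon[sign]]

# Hypothetical shared inventory and per-sign subunit sequences
subunit_models = {"M1": "hmm_M1", "M2": "hmm_M2", "M3": "hmm_M3"}
lexicon = {"HOUSE": ["M1", "M3"], "THANKS": ["M2"]}

print(compose_sign_model("HOUSE", lexicon, subunit_models))  # → ['hmm_M1', 'hmm_M3']
```

Only the lexicon (which subunits make up each sign) is target-language specific; the subunit models themselves are trained once on multilingual data.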