Codeswitching is a very common behavior among Swahili speakers, but of the little computational work done on Swahili, none has focused on codeswitching. This paper addresses two tasks relating to Swahili-English codeswitching: word-level language identification and prediction of codeswitch points. Our two-step model achieves high accuracy at labeling the language of words using a simple feature set combined with label probabilities on the adjacent words. This system is used to label a large Swahili-English internet corpus, which is in turn used to train a model for predicting codeswitch points.
This paper describes the Howard University system for the language identification shared task of the Second Workshop on Computational Approaches to Code Switching. The system builds on prior work on Swahili-English token-level language identification. It primarily uses character n-gram, prefix, and suffix features, together with letter-case and special-character features and previously existing tools. In the final system, these features are combined with generated label probabilities for the token's immediate context.
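The two-step idea described above (token-internal features first, then label probabilities of the neighboring tokens) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `token_features` and `add_context_probs`, the feature-name strings, the n-gram and affix lengths, and the label set are all assumptions chosen for the example, and the first-pass probabilities are assumed to come from some classifier that is not shown.

```python
from typing import Dict, List, Sequence


def token_features(token: str, max_n: int = 3, affix_len: int = 3) -> Dict[str, float]:
    """Step 1 features: character n-grams, prefixes, suffixes, case, special characters.

    All feature names and length limits here are hypothetical choices.
    """
    feats: Dict[str, float] = {}
    lower = token.lower()
    padded = f"^{lower}$"  # boundary markers so n-grams can see word edges

    # Character n-grams of length 1..max_n over the padded, lowercased token.
    for n in range(1, max_n + 1):
        for i in range(len(padded) - n + 1):
            feats[f"ng{n}={padded[i:i + n]}"] = 1.0

    # Prefixes and suffixes up to affix_len characters.
    for k in range(1, min(affix_len, len(lower)) + 1):
        feats[f"pre={lower[:k]}"] = 1.0
        feats[f"suf={lower[-k:]}"] = 1.0

    # Letter-case and special-character indicators.
    feats["init_cap"] = float(token[:1].isupper())
    feats["all_caps"] = float(token.isupper())
    feats["has_special"] = float(any(not c.isalnum() for c in token))
    return feats


def add_context_probs(
    feats: Sequence[Dict[str, float]],
    probs: Sequence[Dict[str, float]],
    labels: Sequence[str],
) -> List[Dict[str, float]]:
    """Step 2: append the first-pass label probabilities of the previous and next token."""
    out: List[Dict[str, float]] = []
    for i, f in enumerate(feats):
        g = dict(f)  # copy so the step-1 features are left untouched
        for offset, name in ((-1, "prev"), (1, "next")):
            j = i + offset
            if 0 <= j < len(probs):
                for lab in labels:
                    g[f"{name}_p_{lab}"] = probs[j].get(lab, 0.0)
        out.append(g)
    return out
```

In use, a first classifier trained on `token_features` output would produce per-token label probabilities, and a second classifier would then be trained on the augmented dictionaries returned by `add_context_probs`. Tokens at the sentence edges simply lack the missing neighbor's features in this sketch.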
Network pruning imposes sparsity on a neural network architecture by increasing the proportion of zero-valued weights, with the aim of reducing the network's size for energy efficiency and increasing evaluation speed. In most prior work, sparsity is enforced without attention to internal network characteristics such as the unbalanced outputs of neurons or, more specifically, the distribution of the weights and outputs of the neurons. Such uncontrolled sparsity may cause a severe accuracy drop. In this work, we propose an attention mechanism that simultaneously controls the sparsity intensity and supervised network pruning by keeping the important information bottlenecks of the network active. On CIFAR-10, the proposed method outperforms the best baseline method by 6% and reduces the accuracy drop by 2.6× at the same level of sparsity.
Achieving information-theoretic security with an explicit coding scheme, under the assumption that the eavesdropper has unlimited computational power, is one of the main topics in security. Polar codes have been shown to be capacity-achieving codes with low encoding and decoding complexity. It has been proven that polar codes achieve the secrecy capacity of binary-input wiretap channels in symmetric settings in which the wiretapper's channel is degraded with respect to the main channel. The first task of this paper is to propose a coding scheme that achieves the secrecy capacity of asymmetric nonbinary-input channels while keeping the reliability and security conditions satisfied. We assume that the wiretap channel is stochastically degraded with respect to the main channel and that the message distribution is unspecified. The main idea is to send the information set over channels that are good for Bob and bad for Eve, and to send random symbols over channels that are good for both. In this scheme, the frozen vector is defined over all possible choices using the concept of a polar code ensemble. We prove that there exists a frozen vector for which the coding scheme satisfies the reliability and security conditions. It is further shown that a uniform message distribution is a necessary condition for achieving the secrecy capacity.
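The index assignment described above (information on channels good for Bob and bad for Eve, random symbols on channels good for both) can be illustrated with a small sketch. It is not the paper's construction: the per-channel scores `z_bob` and `z_eve`, the threshold `delta`, and the function name `partition_indices` are hypothetical, "bad for Eve" is simplified to "not good for Eve", and channels that are not good for Bob are assumed to carry the frozen vector.

```python
from typing import Dict, List, Sequence


def partition_indices(
    z_bob: Sequence[float],
    z_eve: Sequence[float],
    delta: float = 0.1,
) -> Dict[str, List[int]]:
    """Split synthesized-channel indices into information, random, and frozen sets.

    z_bob[i] and z_eve[i] are hypothetical unreliability scores for channel i
    (lower means better); a channel counts as "good" when its score is below delta.
    """
    if len(z_bob) != len(z_eve):
        raise ValueError("z_bob and z_eve must have the same length")

    sets: Dict[str, List[int]] = {"information": [], "random": [], "frozen": []}
    for i, (zb, ze) in enumerate(zip(z_bob, z_eve)):
        good_for_bob = zb < delta
        good_for_eve = ze < delta
        if good_for_bob and not good_for_eve:
            sets["information"].append(i)  # Bob decodes it, Eve learns little
        elif good_for_bob and good_for_eve:
            sets["random"].append(i)       # filled with random symbols
        else:
            sets["frozen"].append(i)       # assumed to carry the frozen vector
    return sets
```

For example, with the made-up scores `z_bob = [0.01, 0.02, 0.9, 0.05]` and `z_eve = [0.8, 0.03, 0.95, 0.5]`, indices 0 and 3 carry information, index 1 carries random symbols, and index 2 is frozen.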