A new approach is proposed for registering a set of two-dimensional histological coronal sections of a rat brain with the coronal sections of a three-dimensional brain atlas, an intrinsic step and a significant challenge in current efforts in brain mapping and multimodal fusion of experimental data. The alignment is based on matching the external contours of the brain sections, and it operates in the presence of the tissue distortion and tears that are routinely encountered, as well as possible scale, rotation, and shear changes (the affine and weak perspective groups). It relies on a novel set of local absolute affine invariants derived from the ordered inflection points on the external contour, which is represented by a cubic B-spline curve. The inflection points are local intrinsic geometric features that are preserved under both affine and weak perspective transformations. The invariants are constructed from the sequence of area patches bounded by the contour and the line connecting two consecutive inflection points, and hence make direct use of the area (volume) invariance property associated with the affine transformation. Because they are local, these absolute invariants are well suited to handling tissue distortion and tears (the occlusion problem).
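The area-patch construction can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the contour is already sampled as a polyline and that the indices of the inflection points are known, so the B-spline fitting and inflection detection are skipped. The helper names (`patch_area`, `area_ratio_invariants`, `sample_contour`, `affine`) and the test curve are hypothetical. Ratios of consecutive patch areas are used here as one simple absolute invariant: an affine map scales every area by the same factor, so the ratios do not change.

```python
import math


def patch_area(points):
    """Unsigned area enclosed by a contour segment and the chord joining its
    endpoints (shoelace formula; the wrap-around edge is the chord)."""
    twice = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice += x0 * y1 - x1 * y0
    return abs(twice) / 2.0


def area_ratio_invariants(contour, inflection_idx):
    """Ratios of consecutive patch areas between ordered inflection points."""
    areas = [patch_area(contour[a:b + 1])
             for a, b in zip(inflection_idx, inflection_idx[1:])]
    return [areas[i + 1] / areas[i] for i in range(len(areas) - 1)]


def sample_contour(widths, n=50):
    """Hypothetical test curve: sine arches of alternating sign, joined so that
    every joint is an inflection point. Returns (points, inflection indices)."""
    pts, idx, x0 = [(0.0, 0.0)], [0], 0.0
    for k, w in enumerate(widths):
        sign = 1.0 if k % 2 == 0 else -1.0
        for j in range(1, n + 1):
            s = j / n
            pts.append((x0 + w * s, sign * w * math.sin(math.pi * s)))
        x0 += w
        idx.append(len(pts) - 1)
    return pts, idx


def affine(points, m, t):
    """Apply the affine map p -> m @ p + t to a list of 2-D points."""
    (a, b), (c, d) = m
    return [(a * x + b * y + t[0], c * x + d * y + t[1]) for x, y in points]


contour, idx = sample_contour([1.0, 2.0, 1.5])
warped = affine(contour, ((2.0, 0.7), (-0.3, 1.5)), (5.0, -3.0))
print(area_ratio_invariants(contour, idx))
print(area_ratio_invariants(warped, idx))
```

The two printed lists agree up to floating-point error, even though the warped contour has been scaled, sheared, and translated. Because each ratio depends only on two neighbouring patches, a tear elsewhere on the contour leaves the remaining ratios intact.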
Named entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with attention mechanisms have yielded excellent performance in NER by taking advantage of character-level and word-level representation learning. In this paper, we propose a deep context-aware bidirectional long short-term memory (CaBiLSTM) model for the Sindhi NER task. The model relies upon contextual representation learning (CRL), a bidirectional encoder, self-attention, and a sequential conditional random field (CRF). The CaBiLSTM model incorporates task-oriented CRL based on joint character-level and word-level representations. It takes character-level input to learn the character representations. Afterwards, the character representations are transformed into word features, and the bidirectional encoder learns the word representations. The output of the final encoder is fed into the self-attention layer through a hidden layer before decoding. Finally, we employ the CRF to predict the label sequences. The baselines and the proposed CaBiLSTM model are compared using pretrained Sindhi GloVe (SdGloVe), Sindhi fastText (SdfastText), task-oriented, and CRL-based word representations on the recently proposed SiNER dataset. Our proposed CaBiLSTM model achieved a high F1-score of 91.25% on the SiNER dataset with CRL, without relying on additional hand-crafted features such as rules, gazetteers, or dictionaries.
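The final CRF step can be illustrated with a minimal decoding sketch. It is not the CaBiLSTM implementation: the encoder and self-attention layers are omitted, and the label set, emission scores, and transition scores below are made-up values chosen only for illustration. The sketch shows Viterbi decoding, the standard way a linear-chain CRF selects the highest-scoring label sequence from per-token (emission) scores and label-to-label (transition) scores.

```python
def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][j]   : score of label j at token t
    transitions[i][j] : score of moving from label i to label j
    """
    n = len(labels)
    score = list(emissions[0])   # best score of any path ending in each label
    backptr = []                 # backptr[t-1][j] = best previous label for j at t
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptr.append(ptrs)
    # Trace the best path backwards from the best final label.
    path = [max(range(n), key=lambda j: score[j])]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    path.reverse()
    return [labels[i] for i in path]


# Hypothetical label set and scores (assumptions, not taken from the paper).
labels = ["O", "B-PER", "I-PER"]
emissions = [
    [2.0, 0.5, 0.0],
    [0.2, 1.0, 1.2],   # token-wise argmax would pick I-PER directly after O
    [0.1, 0.0, 1.5],
]
transitions = [
    [0.0, 0.0, -10.0],  # O -> I-PER is strongly penalised
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
print(viterbi_decode(emissions, transitions, labels))
```

With these scores, labelling each token independently would place `I-PER` directly after `O`, whereas the decoded sequence is `O`, `B-PER`, `I-PER`. Scoring whole sequences in this way is what the CRF layer adds on top of the encoder outputs.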
Clustering short text streams is a challenging task because of their unique properties: infinite length, sparse data representation, and cluster evolution. Existing approaches often process short text streams in batches. However, determining the optimal batch size is usually difficult, since we have no prior knowledge of when the topics evolve. In addition, the traditional independent word representation in graphical models tends to cause a "term ambiguity" problem in short text clustering. Therefore, in this paper, we propose an Online Semantic-enhanced Dirichlet Model for short text stream clustering, called OSDM, which integrates word-occurrence semantic information (i.e., context) into a new graphical model and clusters each arriving short text automatically in an online way. Extensive results demonstrate that OSDM performs better than many state-of-the-art algorithms on both synthetic and real-world data sets.
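The idea of clustering each arriving text online, without batches, can be illustrated with a deliberately simplified sketch. It is not OSDM: it omits the word co-occurrence (semantic) term and cluster decay, and it assigns each text greedily rather than by sampling. The class name `OnlineClusterer`, the parameters `alpha`, `beta`, and `vocab_size`, and the example texts are all assumptions. Each text either joins the existing cluster that scores highest under a smoothed word-likelihood weighted by cluster size, or opens a new cluster when none scores above the new-cluster prior.

```python
import math
from collections import Counter


class OnlineClusterer:
    """Simplified Dirichlet-process-style online clustering (illustrative only)."""

    def __init__(self, alpha=0.5, beta=0.1, vocab_size=10):
        self.alpha = alpha            # weight given to opening a new cluster
        self.beta = beta              # additive smoothing for word counts
        self.vocab_size = vocab_size  # assumed fixed vocabulary size
        self.clusters = []            # each: {"docs", "words", "total"}

    def _log_score(self, cluster, tokens):
        # log(cluster size) + sum of smoothed log word likelihoods
        denom = cluster["total"] + self.vocab_size * self.beta
        s = math.log(cluster["docs"])
        for w in tokens:
            s += math.log((cluster["words"][w] + self.beta) / denom)
        return s

    def add(self, tokens):
        """Assign one arriving text (a list of tokens); return its cluster id."""
        # Score of a brand-new, empty cluster: every word has probability 1/V.
        best = len(self.clusters)
        best_score = math.log(self.alpha) - len(tokens) * math.log(self.vocab_size)
        for k, c in enumerate(self.clusters):
            s = self._log_score(c, tokens)
            if s > best_score:
                best, best_score = k, s
        if best == len(self.clusters):
            self.clusters.append({"docs": 0, "words": Counter(), "total": 0})
        c = self.clusters[best]
        c["docs"] += 1
        c["words"].update(tokens)
        c["total"] += len(tokens)
        return best


model = OnlineClusterer()
stream = [["apple", "banana"], ["apple", "fruit"],
          ["python", "code"], ["python", "java", "code"]]
print([model.add(doc) for doc in stream])
```

In this toy stream the two fruit texts share one cluster and the two programming texts share another, with the second cluster created automatically when the third text arrives. No batch size is chosen anywhere, which is the property the online formulation is meant to provide.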