Learning visual representations of medical images is core to medical image understanding but its progress has been held back by the small size of hand-labeled datasets. Existing work commonly relies on transferring weights from ImageNet pretraining, which is suboptimal due to drastically different image characteristics, or rule-based label extraction from the textual report data paired with medical images, which is inaccurate and hard to generalize. We propose an alternative unsupervised strategy to learn medical visual representations directly from the naturally occurring pairing of images and textual data. Our method of pretraining medical image encoders with the paired text data via a bidirectional contrastive objective between the two modalities is domain-agnostic, and requires no additional expert input. We test our method by transferring our pretrained weights to 4 medical image classification tasks and 2 zero-shot retrieval tasks, and show that our method leads to image representations that considerably outperform strong baselines in most settings. Notably, in all 4 classification tasks, our method requires only 10% as much labeled training data as an ImageNet initialized counterpart to achieve better or comparable performance, demonstrating superior data efficiency.
We propose a novel geolocation prediction model using a complex neural network. Our model unifies text, metadata, and user network representations with an attention mechanism to overcome previous ensemble approaches. In an evaluation using two open datasets, the proposed model exhibited a maximum 3.8% increase in accuracy and a maximum of 6.6% increase in accuracy@161 against previous models. We further analyzed several intermediate layers of our model, which revealed that their states capture some statistical characteristics of the datasets.
This paper describes the system that has been used by TeamX in SemEval-2014 Task 9 Subtask B. The system is a sentiment analyzer based on a supervised text categorization approach designed with following two concepts. Firstly, since lexicon features were shown to be effective in SemEval-2013 Task 2, various lexicons and pre-processors for them are introduced to enhance lexical information. Secondly, since a distribution of sentiment on tweets is known to be unbalanced, an weighting scheme is introduced to bias an output of a machine learner. For the test run, the system was tuned towards Twitter texts and successfully achieved high scoring results on Twitter data, average F 1 70.96 on Twitter2014 and average F 1 56.50 on Twitter2014Sarcasm.
Profile inference of SNS users is valuable for marketing, target advertisement, and opinion polls. Several studies examining profile inference have been reported to date. Although information of various types is included in SNS, most such studies only use text information. It is expected that incorporating information of other types into text classifiers can provide more accurate profile inference. As described in this paper, we propose combined method of text processing and image processing to improve gender inference accuracy. By applying the simple formula to combine two results derived from a text processor and an image processor, significantly increased accuracy was confirmed.
Recently, neural models have shown superior performance over conventional models in NER tasks. These models use CNN to extract sub-word information along with RNN to predict a tag for each word. However, these models have been tested almost entirely on English texts. It remains unclear whether they perform similarly in other languages. We worked on Japanese NER using neural models and discovered two obstacles of the state-ofthe-art model. First, CNN is unsuitable for extracting Japanese sub-word information. Secondly, a model predicting a tag for each word cannot extract an entity when a part of a word composes an entity. The contributions of this work are (i) verifying the effectiveness of the state-of-theart NER model for Japanese, (ii) proposing a neural model for predicting a tag for each character using word and character information. Experimentally obtained results demonstrate that our model outperforms the state-of-the-art neural English NER model in Japanese.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.