Most real systems consist of a large number of interacting, multi-typed components, yet most contemporary research models them as homogeneous networks, without distinguishing the different types of objects and links in the networks. Recently, a growing number of researchers have begun to treat such interconnected, multi-typed data as heterogeneous information networks, and to develop structural analysis approaches that leverage the rich semantics of the typed objects and links in these networks. Compared to the widely studied homogeneous network, the heterogeneous information network contains richer structural and semantic information, which provides plenty of opportunities as well as many challenges for data mining. In this paper, we provide a survey of heterogeneous information network analysis. We introduce the basic concepts of heterogeneous information network analysis, examine its developments on different data mining tasks, discuss some advanced topics, and point out some future research directions.
Index Terms: heterogeneous information network, data mining, semi-structural data, meta path
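To make the notions of typed objects, typed links, and a meta path concrete, the following is a minimal sketch on a toy bibliographic network. The node names, the relation names (`writes`, `published_in`), and the helper `count_meta_path_instances` are hypothetical and chosen only for illustration; the sketch matches meta paths by node type alone, which suffices here because each pair of node types has a single relation.

```python
from collections import defaultdict

# Hypothetical toy network: every node carries a type, every edge a relation.
node_type = {
    "alice": "Author", "bob": "Author", "carol": "Author",
    "p1": "Paper", "p2": "Paper",
    "kdd": "Venue",
}
edges = [
    ("alice", "writes", "p1"),
    ("bob", "writes", "p1"),
    ("bob", "writes", "p2"),
    ("carol", "writes", "p2"),
    ("p1", "published_in", "kdd"),
    ("p2", "published_in", "kdd"),
]

# Build an undirected adjacency map so a meta path can be walked both ways.
neighbors = defaultdict(set)
for src, _relation, dst in edges:
    neighbors[src].add(dst)
    neighbors[dst].add(src)


def count_meta_path_instances(start, meta_path):
    """Count path instances from `start` whose node types follow `meta_path`.

    Returns a dict mapping each reachable end node to its number of instances.
    """
    if node_type[start] != meta_path[0]:
        return {}
    frontier = {start: 1}
    for wanted_type in meta_path[1:]:
        next_frontier = defaultdict(int)
        for node, n_paths in frontier.items():
            for nb in neighbors[node]:
                if node_type[nb] == wanted_type:
                    next_frontier[nb] += n_paths
        frontier = next_frontier
    return dict(frontier)


# Author-Paper-Author: co-authorship.
apa = count_meta_path_instances("alice", ["Author", "Paper", "Author"])
# Author-Paper-Venue-Paper-Author: publishing in the same venue.
apvpa = count_meta_path_instances(
    "alice", ["Author", "Paper", "Venue", "Paper", "Author"]
)
print(apa)
print(apvpa)
```

In this toy network the co-authorship path links `alice` only to herself and `bob`, while the same-venue path also reaches `carol`. A homogeneous author-author graph would merge these two relations into one, which is the loss of semantics the abstract refers to.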
Carbon dots (CDs) are a type of carbon material with an ultrasmall size of less than 10 nm in all three dimensions, and they have attracted increasing attention due to their useful properties. Unfortunately, complicated synthesis methods and low yields largely limit their large-scale application. Herein, an inexpensive and highly efficient aldol condensation method operating at ambient temperature and pressure is proposed for the large-scale synthesis of CDs. The method yields 1.083 kg of product in 2 h and enables the functionalization of CDs by nitrogen doping (NCDs) and sulfur/nitrogen co-doping (NSCDs); the formation mechanism and structure of the CDs are also explained. Moreover, by exploiting the controllable assembly of CDs in combination with theoretical calculations, we designed functionalized 1D carbon fibers (CF) as high-performance potassium storage anode materials, formed through the assembly of CDs induced by a Zn compound. Benefiting from the microstructure and surface functional groups derived from the CDs, the N-doped CF (NCF700) exhibits superior electrochemical energy storage performance for potassium ion batteries (PIBs). This study provides a low-cost and high-yield method to produce CDs and promotes the practical application of CDs in electrochemical energy storage.
We propose a new task, called Story Visualization. Given a multi-sentence paragraph, the story is visualized by generating a sequence of images, one for each sentence. In contrast to video generation, story visualization focuses less on the continuity between generated images (frames) and more on global consistency across dynamic scenes and characters, a challenge that has not been addressed by any single-image or video generation method. We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework. Our model is unique in that it consists of a deep Context Encoder that dynamically tracks the story flow, and two discriminators, at the story and image levels, that enhance the image quality and the consistency of the generated sequences. To evaluate the model, we modified existing datasets to create the CLEVR-SV and Pororo-SV datasets. Empirically, StoryGAN outperforms state-of-the-art models in image quality, contextual consistency metrics, and human evaluation.
Written text often provides sufficient clues to identify the author, their gender, age, and other important attributes. Consequently, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model performance for different user groups, as well as privacy implications. In this paper, we propose an approach to explicitly obscure important author characteristics at training time, such that the learned representations are invariant to these attributes. Evaluating on two tasks, we show that this leads to increased privacy in the learned representations, as well as models that are more robust to varying evaluation conditions, including out-of-domain corpora.
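One common way to obtain representations that are invariant to an author attribute is adversarial training, sketched below. This is an assumption made for illustration only: the abstract does not name the mechanism, and all symbols (encoder $h_\theta$, task classifier $f_\phi$, attribute discriminator $g_\psi$, trade-off weight $\lambda$) are hypothetical.

```latex
% Assumed adversarial objective (illustrative; not taken from the abstract).
% x: input text, y: task label, a: protected author attribute (e.g. age, gender).
\begin{equation}
  \min_{\theta,\,\phi}\;\max_{\psi}\;
  \mathcal{L}_{\mathrm{task}}\bigl(y,\, f_\phi(h_\theta(x))\bigr)
  \;-\; \lambda\,
  \mathcal{L}_{\mathrm{adv}}\bigl(a,\, g_\psi(h_\theta(x))\bigr),
  \qquad \lambda > 0 .
\end{equation}
```

Under this formulation the discriminator $g_\psi$ is trained to predict the attribute $a$ from the representation (maximizing the objective over $\psi$ minimizes $\mathcal{L}_{\mathrm{adv}}$), while the encoder $h_\theta$ is trained to solve the task and to make that prediction fail, which pushes the representation toward invariance with respect to $a$.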