Haoran Shi scite author profile

Haoran Shi

5 Publications

56 Citation Statements Received

116 Citation Statements Given

How they've been cited

110

How they cite others

186

116

Affiliations

Naval University of Engineering, Shandong University of Technology, Peking University

Publications


Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation

Shi, Tan, et al. 2019


We introduce Texar, an open-source toolkit aiming to support the broad set of text generation tasks that transform any inputs into natural language, such as machine translation, summarization, dialog, content manipulation, and so forth. With the design goals of modularity, versatility, and extensibility in mind, Texar extracts common patterns underlying the diverse tasks and methodologies, creates a library of highly reusable modules, and allows arbitrary model architectures and algorithmic paradigms. In Texar, model architecture, inference, and learning processes are properly decomposed. Modules at a high concept level can be freely assembled and plugged in/swapped out. The toolkit also supports a rich set of large-scale pretrained models. Texar is thus particularly suitable for researchers and practitioners to do fast prototyping and experimentation. The versatile toolkit also fosters technique sharing across different text generation tasks. Texar supports both TensorFlow and PyTorch, and is released under Apache License 2.0 at https://www.texar.io.


Large-Scale Frequent Episode Mining from Complex Event Sequences with Hierarchies

Shi, Wang, et al. 2019

ACM Trans. Intell. Syst. Technol.


Frequent Episode Mining (FEM), which aims at mining frequent sub-sequences from a single long event sequence, is one of the essential building blocks of sequence mining research. Existing FEM studies scale poorly on complex sequences because testing whether an episode occurs in a sequence is an NP-complete problem. In this article, we propose a scalable, distributed framework to support FEM on “big” event sequences. As a rule of thumb, “big” means that an event sequence is either very long or contains masses of simultaneous events. Moreover, the events in this article are arranged in a predefined hierarchy, which yields abstract events that can form episodes that may not directly appear in the input sequence. Specifically, we devise an event-centered and hierarchy-aware partitioning strategy to allocate events from different levels of the hierarchy into local processes. We then present an efficient special-purpose algorithm to improve the local mining performance. We also extend our framework to support maximal and closed episode mining in the context of event hierarchy, and to the best of our knowledge, this is the first attempt to define and discover hierarchy-aware maximal and closed episodes. We implement the proposed framework on Apache Spark and conduct experiments on both synthetic and real-world datasets. Experimental results demonstrate the efficiency and scalability of the proposed approach and show that we can find practical patterns when taking event hierarchies into account.


Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation

Shi, Tan, et al. 2018

Preprint


A novel near-infrared trifluoromethyl heptamethine cyanine dye with mitochondria-targeting for integration of collaborative treatment of photothermal and sonodynamic therapy

Shi, Tan, Wang, et al. 2022

Materials Today Advances


Texar: A Modularized, Versatile, and Extensible Toolbox for Text Generation

Hu, Yang, Zhao, et al. 2018


We introduce Texar, an open-source toolkit aiming to support the broad set of text generation tasks. Unlike many existing toolkits that are specialized for specific applications (e.g., neural machine translation), Texar is designed to be highly flexible and versatile. This is achieved by abstracting the common patterns underlying the diverse tasks and methodologies, creating a library of highly reusable modules and functionalities, and enabling arbitrary model architectures and various algorithmic paradigms. These features make Texar particularly suitable for technique sharing and generalization across different text generation applications. The toolkit places heavy emphasis on extensibility and modularized system design, so that components can be freely plugged in or swapped out. We conduct extensive experiments and case studies to demonstrate the use and advantage of the toolkit.


