Xiangnan Ren scite author profile

Xiangnan Ren

3 Publications

34 Citation Statements Received

94 Citation Statements Given

Affiliations

Beijing Institute of Nutritional Sources, Atos (France), Nankai University

Publications

Strider: A Hybrid Adaptive Distributed RDF Stream Processing Engine

Ren, Curé, 2017

Real-time processing of data streams emanating from sensors is becoming a common task in Internet of Things scenarios. The key implementation goal is to handle massive incoming data streams efficiently and to support advanced data analytics services such as anomaly detection. In an ongoing industrial project, we found that a stream processing engine available 24/7 usually faces dynamically changing data and workload characteristics. These changes impact the engine's performance and reliability. We propose Strider, a hybrid adaptive distributed RDF Stream Processing engine that optimizes the logical query plan according to the state of the data streams. Strider has been designed to guarantee important industrial properties such as scalability, high availability, fault tolerance, high throughput and acceptable latency. These guarantees are obtained by designing the engine's architecture with state-of-the-art Apache components such as Spark and Kafka. We highlight the efficiency of Strider on real-world and synthetic data sets: on a single machine, up to a 60x gain in throughput compared to state-of-the-art systems, and on a 9-machine cluster, a throughput of 3.1 million triples/second, a major breakthrough in this category of systems.

BigSR: real-time expressive RDF stream reasoning on modern Big Data platforms

Ren, Curé, Naacke et al., 2018

The trade-off between language expressiveness and system scalability (E&S) is a well-known problem in RDF stream reasoning. Higher expressiveness supports more complex reasoning logic, but it may also hinder system scalability. Current research mainly focuses on logical frameworks suitable for stream reasoning, as well as on the implementation and evaluation of prototype systems. These systems are normally developed in a centralized setting, which suffers from inherently limited scalability, while an in-depth study of applying distributed solutions to cover E&S is still missing. In this paper, we aim to explore the feasibility of applying modern distributed computing frameworks to meet both E&S requirements together. To do so, we first propose BigSR, a technical demonstrator that supports a positive fragment of the LARS framework. For the sake of generality and to cover a wide variety of use cases, BigSR relies on the two main execution models adopted by major distributed execution frameworks: Bulk Synchronous Processing (BSP) and Record-at-A-Time (RAT). Accordingly, we implement BigSR on top of Apache Spark Streaming (BSP model) and Apache Flink (RAT model). To draw conclusions on the impact of BSP and RAT on E&S, we analyze the ability of the two models to support distributed stream reasoning and identify several types of use cases characterized by their levels of support. This classification allows us to quantify the E&S trade-off by assessing the scalability of each type of use case with respect to its level of expressiveness. Then, we conduct a series of experiments with 15 queries from 4 different datasets. Our experiments show that BigSR over both BSP and RAT generally scales up to a throughput beyond one million triples per second (with or without recursion), and that RAT attains sub-millisecond delay for stateless query operators.

Sequence Labeling with Meta-Learning

Li, Han, Ren et al., 2021

IEEE Trans. Knowl. Data Eng.

Recent neural architectures in sequence labeling have yielded state-of-the-art performance on single-domain data such as newswires. However, they still (i) require massive amounts of training data to avoid overfitting and (ii) suffer huge performance degradation when there is a domain shift in the data distribution between training and testing. To make a sequence labeling system more broadly useful, it is crucial to reduce its training data requirements and to transfer knowledge to other domains. In this paper, we investigate the problem of domain adaptation for sequence labeling under homogeneous and heterogeneous settings. We propose METASEQ, a novel meta-learning approach for domain adaptation in sequence labeling. Specifically, METASEQ incorporates meta-learning and adversarial training strategies to encourage robust, general and transferable representations for sequence labeling. The key advantage of METASEQ is that it can adapt to new, unseen domains with a small amount of annotated data from those domains. We extensively evaluate METASEQ on named entity recognition, part-of-speech tagging and slot filling under homogeneous and heterogeneous settings. The experimental results show that METASEQ achieves state-of-the-art performance against eight baselines. Impressively, METASEQ surpasses the in-domain performance using only 16.17% and 7% of target domain data on average for homogeneous settings, and 34.76%, 24%, 22.5% of target domain data on average for heterogeneous settings.
