Qiaoming Zhu scite author profile

Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms. The difficulty of scaling search to large datasets largely explains why most academic work on time series data mining has plateaued at considering a few million time series objects, while much of industry and science sits on billions of time series objects waiting to be explored. In this work we show that by using a combination of four novel ideas we can search and mine truly massive time series for the first time. We demonstrate the following extremely unintuitive fact: in large datasets we can exactly search under DTW much more quickly than the current state-of-the-art Euclidean distance search algorithms. We demonstrate our work on the largest set of time series experiments ever attempted. In particular, the largest dataset we consider is larger than the combined size of all of the time series datasets considered in all data mining papers ever published. We show that our ideas allow us to solve higher-level time series data mining problems such as motif discovery and clustering at scales that would otherwise be untenable. In addition to mining massive datasets, we will show that our ideas also have implications for real-time monitoring of data streams, allowing us to handle much faster arrival rates and/or use cheaper and lower-powered devices than are currently possible.
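For readers unfamiliar with the two distance measures compared in this abstract, the following is a minimal Python sketch of textbook dynamic time warping (DTW) next to Euclidean distance. It is not the paper's method: the abstract does not spell out its four ideas, and none of them are reproduced here. The function names, the squared pointwise cost, and the optional `window` parameter (a Sakoe-Chiba style band) are illustrative assumptions.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b, window=None):
    """Textbook DTW with squared pointwise cost (illustrative sketch only).

    `window` is an assumed parameter: the maximum allowed |i - j| between
    aligned indices. None means the warping path is unconstrained.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least wide enough to reach the final cell.
    window = max(window, abs(n - m))

    inf = math.inf
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])
```

With `window=0` on equal-length inputs the warping path is forced onto the diagonal, so the result coincides with the Euclidean distance. This plain version takes time proportional to the product of the two lengths (less with a narrow band), which illustrates why DTW search is usually assumed to be slower than Euclidean search.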

Exact Discovery of Time Series Motifs

Mueen et al. 2009

Time series motifs are pairs of individual time series, or subsequences of a longer time series, which are very similar to each other. As with their discrete analogues in computational biology, this similarity hints at structure which has been conserved for some reason and may therefore be of interest. Since the formalization of time series motifs in 2002, dozens of researchers have used them for diverse applications in many different domains. Because the obvious algorithm for computing motifs is quadratic in the number of items, more than a dozen approximate algorithms to discover motifs have been proposed in the literature. In this work, for the first time, we show a tractable exact algorithm to find time series motifs. As we shall show through extensive experiments, our algorithm is up to three orders of magnitude faster than brute-force search in large datasets. We further show that our algorithm is fast enough to be used as a subroutine in higher-level data mining algorithms for anytime classification, near-duplicate detection and summarization, and we consider detailed case studies in domains as diverse as electroencephalograph interpretation and entomological telemetry data mining.
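The "obvious" quadratic algorithm mentioned in this abstract can be sketched as follows. This is only the brute-force baseline, not the exact algorithm the paper proposes; the function name `brute_force_motif` and the use of plain Euclidean distance on equal-length series are assumptions made for illustration.

```python
import math
from itertools import combinations


def brute_force_motif(series_list):
    """Return (distance, i, j) for the closest pair of series.

    Compares every pair once, so the number of distance computations
    grows quadratically with the number of series. Illustrative only.
    """
    best = (math.inf, None, None)
    for (i, a), (j, b) in combinations(enumerate(series_list), 2):
        d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        if d < best[0]:
            best = (d, i, j)
    return best


# Toy data (hypothetical): series 0 and 2 are the closest pair.
data = [[0, 0, 0], [5, 5, 5], [0, 0, 1], [9, 9, 9]]
distance, first, second = brute_force_motif(data)
```

For n series this performs n(n-1)/2 distance computations, which is the cost that motivates both the approximate algorithms and the exact algorithm described in the abstract.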

Automatic early warning of tail biting in pigs: 3D cameras can detect lowered tail posture before an outbreak

et al. 2018

Tail biting is a major welfare and economic problem for indoor pig producers worldwide. Low tail posture is an early warning sign which could reduce tail biting unpredictability. Taking a precision livestock farming approach, we used time-of-flight 3D cameras, processing data with machine vision algorithms, to automate the measurement of pig tail posture. Validation of the 3D algorithm found an accuracy of 73.9% at detecting low vs. not low tails (sensitivity 88.4%, specificity 66.8%). Twenty-three groups of 29 pigs per group were reared with intact (not docked) tails under typical commercial conditions over 8 batches. Fifteen groups had tail biting outbreaks, following which enrichment was added to pens and biters and/or victims were removed and treated. 3D data from outbreak groups showed the proportion of low tail detections increased pre-outbreak and declined post-outbreak. Pre-outbreak, the increase in low tails occurred at an increasing rate over time, and the proportion of low tails was higher one week pre-outbreak (-1) than two weeks pre-outbreak (-2). Within each batch, an outbreak and a non-outbreak control group were identified. Outbreak groups had more 3D low tail detections in weeks -1, +1 and +2 than their matched controls. Comparing 3D tail posture and tail injury scoring data, a greater proportion of low tails was associated with more injured pigs. Low tails might indicate more than just tail biting, as tail posture varied between groups and over time and the proportion of low tails increased when pigs were moved to a new pen. Our findings demonstrate the potential for a 3D machine vision system to automate tail posture detection and provide early warning of tail biting on farm.
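The three validation figures quoted in this abstract are linked by a standard identity: accuracy = sensitivity × p + specificity × (1 − p), where p is the share of truly low tails in the validation set. The sketch below solves that identity for p. It assumes all three figures were computed on the same validation set, which the abstract does not state explicitly; the function name is hypothetical and the result is a back-of-the-envelope inference, not a number reported by the authors.

```python
def implied_positive_share(accuracy, sensitivity, specificity):
    """Solve accuracy = sens * p + spec * (1 - p) for p.

    p is the implied fraction of positive (low-tail) cases, assuming
    all three metrics come from the same labelled sample.
    """
    return (accuracy - specificity) / (sensitivity - specificity)


# Figures quoted in the abstract, expressed as fractions.
p = implied_positive_share(0.739, 0.884, 0.668)

# Sanity check: plugging p back in reproduces the reported accuracy.
reconstructed = 0.884 * p + 0.668 * (1 - p)
```

Under that assumption, roughly a third of the validation cases would have been low tails, which helps explain why overall accuracy sits closer to the specificity than to the sensitivity.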

Fast Human Detection Using a Cascade of Histograms of Oriented Gradients

et al.

Exploiting constituent dependencies for tree kernel-based semantic relation extraction

et al. 2008

This paper proposes a new approach to dynamically determine the tree span for tree kernel-based semantic relation extraction. It exploits constituent dependencies to keep the nodes and their head children along the path connecting the two entities, while removing the noisy information from the syntactic parse tree, eventually leading to a dynamic syntactic parse tree. This paper also explores entity features and their combined features in a unified parse and semantic tree, which integrates both structured syntactic parse information and entity-related semantic information. Evaluation on the ACE RDC 2004 corpus shows that our dynamic syntactic parse tree outperforms all previous tree spans, and that the composite kernel combining this tree kernel with a state-of-the-art linear feature-based kernel achieves the best performance reported so far.

Searching and mining trillions of time series subsequences under dynamic time warping
