Abstract: A time series is a sequence of observations collected at fixed sampling intervals. Several real-world dynamic processes can be modeled as time series, such as stock price movements, exchange rates, and temperatures. As a special kind of data stream, a time series may present concept drift, which negatively affects time series analysis and forecasting. Explicit drift detection methods based on monitoring the time series features may provide a better understanding of how concepts evolve over time than methods based on monitoring the forecasting error of a base predictor. In this paper, we propose an online explicit drift detection method, called Feature Extraction for Explicit Concept Drift Detection (FEDD), that identifies concept drifts in time series by monitoring time series features. Computational experiments showed that FEDD performed better than error-based approaches on several linear and nonlinear artificial time series with abrupt and gradual concept drifts.
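The idea of monitoring features rather than forecasting error can be illustrated with a minimal sketch. This is not the FEDD algorithm itself: the feature set (mean, standard deviation, lag-1 autocorrelation), the Euclidean distance, the fixed threshold, and the function names `window_features` and `detect_feature_drift` are all assumptions chosen for illustration.

```python
import numpy as np


def window_features(window):
    """Return [mean, standard deviation, lag-1 autocorrelation] of a window."""
    w = np.asarray(window, dtype=float)
    centered = w - w.mean()
    denom = np.dot(centered, centered)
    # Guard against a constant window, where the autocorrelation is undefined.
    lag1 = np.dot(centered[:-1], centered[1:]) / denom if denom > 0 else 0.0
    return np.array([w.mean(), w.std(), lag1])


def detect_feature_drift(series, window=50, threshold=1.0):
    """Return the index of the first observation at which the sliding-window
    features differ from the initial window's features by more than
    `threshold` (Euclidean distance), or None if that never happens."""
    series = np.asarray(series, dtype=float)
    reference = window_features(series[:window])
    for end in range(window + 1, len(series) + 1):
        current = window_features(series[end - window:end])
        if np.linalg.norm(current - reference) > threshold:
            return end - 1
    return None


if __name__ == "__main__":
    t = np.arange(400)
    stationary = np.sin(0.3 * t)
    # Abrupt drift: the level of the series jumps by 5 at index 200.
    drifting = np.where(t < 200, stationary, stationary + 5.0)
    print("stationary:", detect_feature_drift(stationary))
    print("drifting:  ", detect_feature_drift(drifting))
```

Because the detector compares feature vectors directly, a flagged drift can be traced back to the feature that changed, which is the kind of interpretability the abstract attributes to explicit methods.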
The use of features extracted using a deep convolutional neural network (CNN) combined with a writer-dependent (WD) SVM classifier resulted in a significant improvement in the performance of handwritten signature verification (HSV) when compared to the previous state-of-the-art methods. This work investigates whether these CNN features also provide good results in a writer-independent (WI) HSV context, based on the dichotomy transformation combined with a writer-independent SVM classifier. The experiments performed on the Brazilian and GPDS datasets show that (i) the proposed approach outperformed other WI-HSV methods from the literature, (ii) in the global threshold scenario, the proposed approach was able to outperform the writer-dependent method with CNN features on the Brazilian dataset, and (iii) in a user threshold scenario, the results are similar to those obtained by the writer-dependent method with CNN features.
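The dichotomy transformation mentioned above turns a many-writer problem into a two-class problem by describing each pair of samples with the absolute difference of their feature vectors. The sketch below is a simplified illustration under stated assumptions: the function name `dichotomy_transform` is hypothetical, pairs are labeled only by whether the two samples share a writer, and the handling of forgeries and the CNN feature extraction used in the paper are not modeled.

```python
from itertools import combinations

import numpy as np


def dichotomy_transform(features, writers):
    """Map every pair of samples to |f_i - f_j| with a binary label.

    Label 1 means both samples come from the same writer,
    label 0 means they come from different writers.
    """
    features = np.asarray(features, dtype=float)
    diffs, labels = [], []
    for i, j in combinations(range(len(features)), 2):
        diffs.append(np.abs(features[i] - features[j]))
        labels.append(1 if writers[i] == writers[j] else 0)
    return np.array(diffs), np.array(labels)


if __name__ == "__main__":
    # Toy 2-D feature vectors; real features would come from a CNN.
    feats = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    who = ["a", "a", "b"]
    pair_diffs, pair_labels = dichotomy_transform(feats, who)
    print(pair_diffs)
    print(pair_labels)
```

A single binary classifier trained on these difference vectors can then be applied to writers that were not seen during training, which is what makes the approach writer-independent.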
The performance of classification models can be negatively impacted if the data on which they are trained contains very rare events. While recent research has investigated the issue of class imbalance, few if any studies address issues related to the handling of extreme imbalance (rare events), where the minority class can account for as little as 0.1% of the training data. This work investigates the effect of dataset size and class distribution on classification performance when examples from the minority class are rare. In addition, we compare the performance improvement achieved by acquiring additional examples to that of applying data sampling. Our results demonstrate that data sampling is very effective at alleviating the problem of rare events.
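As one concrete example of data sampling, the sketch below shows random undersampling of the majority class. The abstract does not say which sampling techniques were evaluated, so this choice, the function name `random_undersample`, and its parameters are assumptions made for illustration.

```python
import numpy as np


def random_undersample(X, y, minority_label=1, ratio=1.0, seed=0):
    """Keep all minority examples and a random subset of the majority.

    `ratio` is the desired number of majority examples per minority example.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)
    majority_idx = np.flatnonzero(y != minority_label)
    # Never ask for more majority examples than exist.
    n_keep = min(len(majority_idx), int(round(ratio * len(minority_idx))))
    kept = rng.choice(majority_idx, size=n_keep, replace=False)
    idx = np.sort(np.concatenate([minority_idx, kept]))
    return X[idx], y[idx]


if __name__ == "__main__":
    # 10,000 examples with a 0.1% minority class (10 positives).
    y = np.zeros(10_000, dtype=int)
    y[::1000] = 1
    X = np.arange(10_000).reshape(-1, 1)
    X_bal, y_bal = random_undersample(X, y, ratio=1.0)
    print(len(y_bal), int(y_bal.sum()))
```

Undersampling discards majority examples, so with very rare events the balanced set can become small; that trade-off is one reason to compare sampling against acquiring more data, as the abstract does.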
Concept drift is a change in the joint probability distribution of the problem. It can be subdivided into two types: real drifts, which affect the conditional probabilities p(y|x), and virtual drifts, which affect the unconditional probability distribution p(x). Most existing work focuses on dealing with real concept drifts. However, virtual drifts can also degrade predictive performance, so they also require mechanisms to handle them. Moreover, as virtual drifts frequently mean that part of the old knowledge remains useful, they require different strategies from real drifts to be tackled effectively. Motivated by this, we propose an approach called Gaussian Mixture Model for Dealing With Virtual and Real Concept Drifts (GMM-VRD), which updates and creates Gaussians to tackle virtual drifts and resets the system to deal with real drifts. The main results show that the proposed approach obtained the best average accuracy among the methods from the literature that address the same problem. In terms of accuracy over time, the proposed approach showed lower degradation when concept drifts occurred, which indicates that it handled the drifts efficiently.
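The distinction between the two drift types follows from the product rule for the joint distribution. The time index t below is notation introduced here for illustration and does not appear in the abstract.

```latex
% Joint distribution at time t, factored by the product rule
p_t(x, y) = p_t(y \mid x)\, p_t(x)

% Real drift: the conditional distribution changes
\exists\, x :\; p_t(y \mid x) \neq p_{t+1}(y \mid x)

% Virtual drift: only the input distribution changes
p_t(x) \neq p_{t+1}(x), \qquad p_t(y \mid x) = p_{t+1}(y \mid x)
```

Under a purely virtual drift the mapping from inputs to labels is unchanged, which is why previously learned knowledge can remain useful and a full reset is not necessarily required.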