We are concerned with the issue of detecting changes and their signs from a data stream. For example, when given time series of COVID-19 cases in a region, we may raise early warning signals of an epidemic by detecting signs of changes in the data. We propose a novel methodology to address this issue. The key idea is to employ a new information-theoretic notion, which we call the differential minimum description length change statistics (D-MDL), for measuring the scores of change sign. We first give a fundamental theory for D-MDL. We then demonstrate its effectiveness using synthetic datasets. We apply it to detecting early warning signals of the COVID-19 epidemic using time series of the cases for individual countries. We empirically demonstrate that D-MDL is able to raise early warning signals of events such as significant increase/decrease of cases. Remarkably, for about $$64\%$$
64
%
of the events of significant increase of cases in studied countries, our method can detect warning signals as early as nearly six days on average before the events, buying considerably long time for making responses. We further relate the warning signals to the dynamics of the basic reproduction number R0 and the timing of social distancing. The results show that our method is a promising approach to the epidemic analysis from a data science viewpoint.
X-ray computed tomography (CT) allows for visualization of the interior of solid objects in a non-destructive and non-invasive
manner. However, producing high-precision measurements takes a long time because thousands of sharp transmission images
are required to reconstruct CT volumes. To address this problem, we propose a CT measurement method based on Convolutional
Neural Networks (CNNs) that yields sharp transmission images by deblurring blurry ones. In this method, first, blurry images are
obtained in a short measurement time, then they are deblurred by CNNs with fine-tuning and integrated by linear interpolation.
This method shortens the measurement time indirectly because the process related to the CNNs is fast with GPUs and the blurry
images with low levels of noise intensity do not require a long time. Besides, the fine-tuning may improve the output images’
sharpness. According to our experimental results, the proposed method is fast and can maintain the quality of data to a certain
extent.
Graph embedding methods are effective techniques for representing nodes and their relations in a continuous space. Specifically, the hyperbolic space is more effective than the Euclidean space for embedding graphs with tree-like structures. Thus, it is critical how to select the best dimensionality for the hyperbolic space in which a graph is embedded. This is because we cannot distinguish nodes well with dimensionality that is considerably low, whereas the embedded relations are affected by irregularities in data with excessively high dimensionality. We consider this problem from the viewpoint of statistical model selection for latent variable models. Thereafter, we propose a novel methodology for dimensionality selection based on the minimum description length principle. We aim to introduce a latent variable modeling of hyperbolic embeddings and apply the decomposed normalized maximum likelihood code-length to latent variable model selection. We empirically demonstrated the effectiveness of our method using both synthetic and real-world datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.