Time series analysis, as an application of high-dimensional data mining, is a common task in biochemistry, meteorology, climate research, biomedicine and marketing. Similarity search in data with increasing dimensionality results in an exponential growth of the search space, referred to as the curse of dimensionality. A common approach to postpone this effect is to apply approximation to reduce the dimensionality of the original data prior to indexing. However, approximation involves loss of information, which also leads to an exponential growth of the search space. Therefore, indexing an approximation with a high dimensionality, i.e. high quality, is desirable. We introduce Symbolic Fourier Approximation (SFA) and the SFA trie, which allows for indexing of not only large datasets but also high-dimensional approximations. This is done by exploiting the trade-off between the quality of the approximation and the degeneration of the index by using a variable number of dimensions to represent each approximation. Our experiments show that SFA combined with the SFA trie can scale up to a factor of 5-10 more indexed dimensions than previous approaches. Thus, for exact similarity search on real-world and synthetic data, it lowers page accesses by a factor of 2-25 and CPU costs by a factor of 2-11.
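To make the idea of a symbolic, variable-length approximation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the approximation keeps the first few discrete Fourier coefficients of a series and maps each real and imaginary part to a letter using fixed equal-width bins. The function names `dft_prefix` and `to_symbols`, the bin range and the alphabet are all hypothetical choices made for illustration; the abstract does not specify how SFA chooses its bins.

```python
import cmath
from typing import List


def dft_prefix(series: List[float], n_coeffs: int) -> List[float]:
    """Real and imaginary parts of the first n_coeffs DFT coefficients.

    Assumption: a plain, normalized DFT stands in for the approximation step.
    """
    n = len(series)
    features: List[float] = []
    for k in range(n_coeffs):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(series)
        )
        coeff /= n
        features.extend([coeff.real, coeff.imag])
    return features


def to_symbols(features: List[float], lo: float = -1.0, hi: float = 1.0,
               alphabet: str = "abcd") -> str:
    """Map each feature to a letter using equal-width bins (illustrative only)."""
    width = (hi - lo) / len(alphabet)
    word = []
    for value in features:
        idx = int((value - lo) // width)
        idx = min(max(idx, 0), len(alphabet) - 1)  # clamp out-of-range values
        word.append(alphabet[idx])
    return "".join(word)


series = [0.3, 0.9, 0.1, -0.4, -0.8, -0.2, 0.5, 0.7]
short_word = to_symbols(dft_prefix(series, 1))
long_word = to_symbols(dft_prefix(series, 3))
```

In this sketch a word built from fewer coefficients is always a prefix of the word built from more coefficients, which is the property a trie can exploit when it represents different series with different numbers of dimensions.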
A lease is a token which grants its owner exclusive access to a resource for a defined span of time. In order to tolerate failures, leases need to be coordinated by distributed processes. We present FaTLease, an algorithm for fault-tolerant lease negotiation in distributed systems. It is built on the Paxos algorithm for distributed consensus, but avoids Paxos' main performance bottleneck of requiring persistent state. This property makes our algorithm particularly useful for applications that cannot spare any disk bandwidth. Our experiments show that FaTLease scales up to tens of thousands of concurrent leases and can negotiate thousands of leases per second in both LAN and WAN environments.
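The lease definition above can be illustrated with a small single-process sketch. This is not FaTLease and involves neither Paxos nor any distributed coordination; it only shows the lease semantics of exclusive, time-limited ownership. The names `Lease` and `LocalLeaseTable` are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Lease:
    resource: str
    owner: str
    expires_at: float  # point in time at which the lease stops being valid

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


class LocalLeaseTable:
    """Hypothetical in-memory lease table; a single process, no fault tolerance."""

    def __init__(self) -> None:
        self._leases: Dict[str, Lease] = {}

    def acquire(self, resource: str, owner: str, now: float,
                duration: float) -> Optional[Lease]:
        current = self._leases.get(resource)
        # Refuse if another owner still holds a valid lease on the resource.
        if current is not None and current.is_valid(now) and current.owner != owner:
            return None
        # Otherwise grant a new lease, or renew the owner's existing one.
        lease = Lease(resource, owner, now + duration)
        self._leases[resource] = lease
        return lease


table = LocalLeaseTable()
first = table.acquire("volume-1", "A", now=0.0, duration=10.0)
denied = table.acquire("volume-1", "B", now=5.0, duration=10.0)
after_expiry = table.acquire("volume-1", "B", now=10.0, duration=10.0)
```

Here `denied` is `None` because owner A's lease is still valid at time 5, while the request at time 10 succeeds because the lease has expired. A distributed algorithm has to provide the same guarantee when the table itself is replicated across processes that may fail.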