Big Data Analytics is an emerging field since massive storage and computing capabilities have been made available by advanced e-infrastructures. Earth and Environmental sciences are likely to benefit from Big Data Analytics techniques supporting the processing of the large number of Earth Observation datasets currently acquired and generated through observations and simulations. However, Earth Science data and applications present specificities in terms of relevance of the geospatial information, wide heterogeneity of data models and formats, and complexity of processing. Therefore, Big Earth Data Analytics requires specifically tailored techniques and tools. The EarthServer Big Earth Data Analytics engine offers a solution for coverage-type datasets, built around a high performance array database technology, and the adoption and enhancement of standards for service interaction (OGC WCS and WCPS). The EarthServer solution, led by the collection of requirements from scientific communities and international initiatives, provides a holistic approach that ranges from query languages and scalability up to mobile access and visualization. The result is demonstrated and validated through the development of lighthouse applications in the Marine, Geology, Atmospheric, Planetary and Cryospheric science domains.
With the unprecedented increase of orbital sensor, in situ measurement, and simulation data there is a rich, yet not leveraged potential for obtaining insights from dissecting datasets and rejoining them with other datasets. Obviously, goal is to allow users to "ask any question, any time, on any size", thereby enabling them to "build their own product on the go".One of the most influential initiatives in EO is EarthServer which has demonstrated new directions for flexible, scalable EO services based on innovative NoSQL
Since array data of arbitrary dimensionality appears in massive amounts in a wide range of application domains, such as geographic information systems, climate simulations, and medical imaging, it has become crucial to build scalable systems for complex query answering in real time. Cloud architectures can be expected to significantly speed up array databases.We present an enhancement of the well-established Array DBMS rasdaman with intra-query distribution capabilities: requests, incoming in the form of database queries, are broken into sub-queries which are then distributed in a network of known peers. The splitting strategies ensure that the network's processing power is fully exploited, while at the same time enabling optimizations such as network traffic minimization.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.