Hadoop has emerged as the de facto state-of-the-art system for MapReduce-based data analytics. The reliability of Hadoop systems depends in part on how well they handle failures. Currently, Hadoop handles machine failures by re-executing all the tasks of the failed machines (i.e., executing recovery tasks). Unfortunately, this elegant solution is entirely entrusted to the core of Hadoop and hidden from Hadoop schedulers. This unawareness of failures may therefore prevent Hadoop schedulers from meeting their objectives (e.g., fairness, job priority) and can significantly impact the performance of MapReduce applications. This paper presents Chronos, a failure-aware scheduling strategy that enables an early yet smart action for fast failure recovery while still operating within a specific scheduler objective. Upon failure detection, rather than waiting an uncertain amount of time to get resources for recovery tasks, Chronos leverages a lightweight preemption technique to carefully allocate these resources. In addition, Chronos considers data locality when scheduling recovery tasks to further improve performance. We demonstrate the utility of Chronos by combining it with the Fifo and Fair schedulers. The experimental results show that Chronos recovers to a correct scheduling behavior within only a couple of seconds and reduces job completion times by up to 55% compared to state-of-the-art schedulers.
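The idea of allocating a slot to a recovery task through preemption while respecting data locality can be illustrated with a small sketch. This is not the Chronos implementation: the `RunningTask` type, the `place_recovery_task` function, the priority convention (lower number means higher priority) and the order of the four fallback steps are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningTask:
    task_id: str
    node: str
    job_priority: int  # assumption: lower number = higher priority


def place_recovery_task(priority, data_nodes, free_nodes, running):
    """Pick a node for a recovery task; return (node, task_to_preempt_or_None).

    Hypothetical policy, in order of preference:
      1. a free slot on a node that holds the task's input data,
      2. preempt a lower-priority task on such a node,
      3. any free slot,
      4. preempt a lower-priority task anywhere.
    """
    local_free = [n for n in free_nodes if n in data_nodes]
    if local_free:
        return local_free[0], None

    # Only tasks of strictly lower-priority jobs may be preempted,
    # so the scheduler's own ordering objective is not violated.
    victims = [t for t in running if t.job_priority > priority]
    local_victims = [t for t in victims if t.node in data_nodes]
    if local_victims:
        victim = max(local_victims, key=lambda t: t.job_priority)
        return victim.node, victim

    if free_nodes:
        return free_nodes[0], None

    if victims:
        victim = max(victims, key=lambda t: t.job_priority)
        return victim.node, victim

    return None, None  # nothing available: the recovery task has to wait


running = [
    RunningTask("job2_map1", "n1", 2),
    RunningTask("job3_map4", "n2", 3),
]
# The recovery task belongs to the highest-priority job and its data lives on n2.
node, victim = place_recovery_task(1, {"n2"}, ["n3"], running)
```

In this example the only free slot (`n3`) holds no replica of the input, so the sketch preempts the lowest-priority task on `n2` instead of accepting a remote read.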
Mangrove forests provide vital ecosystem services. The increasing threats to mangrove forest extent and fragmentation can be monitored from space. Accurate, spatially explicit quantification of key vegetation characteristics of mangroves, such as leaf area index (LAI), would further advance our monitoring efforts to assess ecosystem health and functioning. Here, we investigated the potential of radiative transfer models (RTM), combined with active learning (AL), to estimate LAI from Sentinel-2 spectral reflectance in the mangrove-dominated region of Ngoc Hien, Vietnam. We validated the retrieved LAI estimates against in situ measurements based on hemispherical photography and compared them against red-edge NDVI and the Sentinel Application Platform (SNAP) biophysical processor. Our results highlight the performance of physics-based machine learning using Gaussian processes regression (GPR) in combination with AL for the estimation of mangrove LAI. Our AL-driven hybrid GPR model substantially outperformed SNAP (R² = 0.77 and 0.44, respectively) as well as the red-edge NDVI approach. Comparing two canopy RTMs, the highest accuracy was achieved by PROSAIL (RMSE = 0.13 m² m⁻², NRMSE = 9.57%, MAE = 0.1 m² m⁻²). The successful retrieval of mangrove LAI from Sentinel-2 can overcome the extensive reliance on scarce in situ measurements for training seen in other approaches and offers more scalable applicability by relying on the universal principles of physics in combination with uncertainty estimates. AL-based GPR models using RTM simulations allow us to adapt the genericity of RTMs to the peculiarities of distinct ecosystems such as mangrove forests with limited ancillary data. These findings suggest potential for retrieving a wider range of vegetation variables to quantify large-scale mangrove ecosystem dynamics in space and time.
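The validation statistics quoted above (R², RMSE, NRMSE, MAE) can be computed from paired observed and estimated LAI values as in the sketch below. The abstract does not state which definitions were used, so two choices here are assumptions: NRMSE is normalised by the range of the observations, and R² is taken as the squared Pearson correlation. The sample values are hypothetical and are not data from the study.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error, in the units of the variable (m² m⁻² for LAI)."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error, in the units of the variable."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def nrmse_percent(obs, pred):
    """RMSE as a percentage of the observed range (assumed normalisation)."""
    return 100.0 * rmse(obs, pred) / (max(obs) - min(obs))


def r_squared(obs, pred):
    """Squared Pearson correlation between observations and estimates (assumed definition)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical in situ LAI measurements and model estimates.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.1, 1.9, 3.2, 3.8]

print(f"R2    = {r_squared(observed, estimated):.3f}")
print(f"RMSE  = {rmse(observed, estimated):.3f}")
print(f"NRMSE = {nrmse_percent(observed, estimated):.2f}%")
print(f"MAE   = {mae(observed, estimated):.3f}")
```

Normalising by the mean of the observations instead of their range is also common and gives a different NRMSE, so the definition should be checked before comparing values across studies.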
The main purpose of this paper is to assess the land use and land cover (LULC) changes over thirty years, from 1990 to 2020, in the Dong Thap Muoi, a flooded land area of the Mekong River Delta of Vietnam, using Google Earth Engine and the random forest algorithm. The specific purposes are to (1) determine the main LULC classes and (2) compute and analyze the magnitude and rate of change for these LULC classes. For these purposes, 128 Landsat images, topographic maps, land use status maps, cadastral maps, and ancillary data were collected and used to derive the LULC maps with the random forest classification algorithm. The overall accuracies of the LULC maps for 1990, 2000, 2010, and 2020 are 88.9, 83.5, 87.1, and 85.6%, respectively. The results show that unused land was dominant in 1990, with 28.9% of the total area, but was primarily converted to paddy, the new dominant LULC class in 2020 (45.1%). Forest declined significantly, from 14.4% of the total area in 1990 to only 5.5% in 2020. Over the same period, built-up land increased from 0.3% to 6.2% of the total area. This research may help the authorities design exploitation policies for the Dong Thap Muoi's socio-economic development and develop a new, stable, and sustainable ecosystem that promotes the advantages of the region and supports the early formation of a diversified agricultural structure.
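As a worked example of the magnitude and rate of change, the sketch below uses only the two classes for which the abstract gives both the 1990 and the 2020 share. Expressing magnitude in percentage points of total area and rate as a simple mean per year are assumptions; the paper may define these quantities differently.

```python
# Area shares (% of total area) quoted in the abstract. Other classes are
# omitted because the abstract does not report both endpoints for them.
shares = {
    "forest": {1990: 14.4, 2020: 5.5},
    "built-up": {1990: 0.3, 2020: 6.2},
}


def change_summary(start_share, end_share, years):
    """Return (magnitude in percentage points, mean annual rate in points/year)."""
    magnitude = end_share - start_share
    return magnitude, magnitude / years


summary = {
    name: change_summary(s[1990], s[2020], 2020 - 1990)
    for name, s in shares.items()
}

for name, (magnitude, rate) in summary.items():
    print(f"{name}: {magnitude:+.1f} points, {rate:+.2f} points/year")
```

Under these assumptions, forest lost 8.9 percentage points (about 0.30 points per year) and built-up land gained 5.9 points (about 0.20 points per year).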
Large-scale data analysis has increasingly come to rely on MapReduce and its open-source implementation, Hadoop. Recently, Hadoop has not only been used for running single batch jobs but has also been optimized to support the simultaneous execution of multiple jobs belonging to multiple concurrent users. Several schedulers (i.e., the Fifo, Fair, and Capacity schedulers) have been proposed to optimize the locality of task execution, but they do not consider failures, although evidence in the literature shows that faults do occur and can probably result in performance problems. In this paper, we have designed a set of experiments to evaluate the performance of Hadoop under failure with several schedulers (i.e., to explore the conflict between job scheduling, locality-aware execution, and failures). Our results reveal several drawbacks of Hadoop's current mechanism for prioritizing failed tasks. By trying to launch failed tasks as soon as possible regardless of locality, it significantly increases the execution time of jobs with failed tasks, for two reasons: 1) available resources might not be freed up as quickly as expected, and 2) failed tasks might be re-executed on machines that do not hold their input data, introducing extra cost for transferring data over the network, which is normally the scarcest resource in today's data centers. Our preliminary study with Hadoop not only helps us understand the interplay between fault tolerance and job scheduling, but also offers useful insights into making current schedulers more efficient in the presence of failures.
Purpose: Groundwater plays a critical part in both natural and human existence. When surface water is scarce in arid climates, groundwater becomes an immensely valuable resource. Dak Lak is an area that frequently lacks water resources for everyday living and production, and the scarcity of water resources is exacerbated during the dry season. As a result, it is critical to study and understand groundwater in order to meet the region's water demand. This study aims to extend the use of the MODFLOW model for groundwater simulation and to assess the overall groundwater reserves and water demand in the highland province of Dak Lak. Design/methodology/approach: The MODFLOW model is used in this work to compute and analyze the flow and prospective reserves of groundwater, from which to plan extraction and estimate future groundwater variation. Findings: The application of the MODFLOW model to Dak Lak province demonstrates that, despite limited data, particularly borehole data for groundwater research, the model's results are reliable and that it has great potential for use in other similar places. The use of the model in conjunction with other data extraction modules is a useful input for creating underground flow module maps for various time periods. Simulations of the climate change scenarios RCP4.5 and RCP8.5 demonstrate the large impact of recharge and evaporation on groundwater supplies and the water balance in the research area. Originality/value: No previous study has analyzed the water resources of Dak Lak, where water scarcity is exacerbated during the dry season. This study therefore provides useful insights into water resource management and conservation in Dak Lak. According to the results obtained and the water balance of the study area, the groundwater in Dak Lak can meet the area's water demand. However, the management of water resources and rigorous monitoring of groundwater extraction activities in the area should receive more attention.