Modern science often requires the execution of large-scale, multi-stage simulation and data analysis pipelines to enable the study of complex systems. The amount of computation and data involved in these pipelines requires scalable workflow management systems that can reliably and efficiently coordinate and automate data movement and task execution on distributed computational resources: campus clusters, national cyberinfrastructures, and commercial and academic clouds. This paper describes the design, development, and evolution of the Pegasus Workflow Management System, which maps abstract workflow descriptions onto distributed computing infrastructures. Pegasus has been used for more than twelve years by scientists in a wide variety of domains, including astronomy, seismology, bioinformatics, physics, and others. This paper provides an integrated view of the Pegasus system, showing the capabilities that have been developed over time in response to application needs and to the evolution of scientific computing platforms. It describes how Pegasus achieves reliable, scalable workflow execution across a wide variety of computing infrastructures.
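To give a flavor of what an "abstract workflow description" looks like, here is a minimal sketch using the Pegasus 5.x Python API (newer than the interfaces the paper describes); the job names, files, and arguments are invented for illustration:

```python
# A minimal sketch of an abstract (resource-independent) Pegasus workflow.
# The transformation names ("preprocess", "analyze") and file names are
# hypothetical placeholders, not taken from the paper.
from Pegasus.api import Workflow, Job, File

wf = Workflow("diamond-sketch")

fa = File("f.a")   # input supplied by the user
fb = File("f.b")   # intermediate product
fc = File("f.c")   # final output

preprocess = (
    Job("preprocess")                 # logical transformation name; binding
    .add_args("-i", fa, "-o", fb)     # to an executable happens at planning time
    .add_inputs(fa)
    .add_outputs(fb)
)

analyze = (
    Job("analyze")
    .add_args("-i", fb, "-o", fc)
    .add_inputs(fb)
    .add_outputs(fc)
)

# Dependencies are inferred from file usage; the planner later maps this
# abstract DAG onto concrete resources (clusters, grids, clouds).
wf.add_jobs(preprocess, analyze)
wf.write("workflow.yml")
```

The key design point the abstract alludes to is the separation of concerns: the description above says nothing about hosts, schedulers, or data transfer, all of which Pegasus supplies when the workflow is planned for a specific infrastructure.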
We have successfully applied full-3-D tomography (F3DT), based on a combination of the scattering-integral method (SI-F3DT) and the adjoint-wavefield method (AW-F3DT), to iteratively improve a 3-D starting model, the Southern California Earthquake Center (SCEC) Community Velocity Model version 4.0 (CVM-S4). In F3DT, the sensitivity (Fréchet) kernels are computed using numerical solutions of the 3-D elastodynamic equation, and the nonlinearity of the structural inversion problem is accounted for through an iterative tomographic navigation process. More than half a million misfit measurements, made on about 38,000 earthquake seismograms and 12,000 ambient-noise correlograms, have been assimilated into our inversion. After 26 F3DT iterations, synthetic seismograms computed using our latest model, CVM-S4.26, show substantially better fits to observed seismograms at frequencies below 0.2 Hz than those computed using our 3-D starting model CVM-S4 or the other SCEC CVM, CVM-H11.9, which was improved through 16 iterations of AW-F3DT. CVM-S4.26 reveals strong crustal heterogeneities throughout Southern California, some of which are completely missing in CVM-S4 and CVM-H11.9 but appear in models obtained from previous crustal-scale 2-D active-source refraction tomography. At shallow depths, our model correlates strongly with sedimentary basins and reveals velocity contrasts across major mapped strike-slip and dip-slip faults. At middle-to-lower crustal depths, structural features in our model may provide new insights into regional tectonics. When combined with physics-based seismic hazard analysis tools, our model is expected to provide more accurate estimates of seismic hazards in Southern California.
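To make the iterative scheme concrete, a single F3DT iteration can be written schematically as a linearized inverse problem; the notation below is a generic full-waveform-tomography sketch, ours rather than the paper's. Each misfit measurement $\delta d_i$ is related to a perturbation $\delta m$ of the current model $m_k$ through its Fréchet kernel $K_i$, computed from 3-D numerical wavefield simulations:

$$\delta d_i \;\approx\; \int_V K_i(\mathbf{x})\,\delta m(\mathbf{x})\,\mathrm{d}^3\mathbf{x},$$

and the iteration finds a regularized update and moves the model,

$$\delta m_k \;=\; \arg\min_{\delta m}\; \sum_i \Bigl(\delta d_i - \int_V K_i\,\delta m\,\mathrm{d}^3\mathbf{x}\Bigr)^{\!2} \;+\; \lambda\,\lVert \mathsf{R}\,\delta m \rVert^2, \qquad m_{k+1} = m_k + \delta m_k,$$

where $\mathsf{R}$ is a smoothing/regularization operator and $\lambda$ its weight. Repeating this linearized step (with kernels recomputed in the updated model) is how the nonlinearity of the structural inversion is handled across the 26 iterations described above.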
CyberShake, as part of the Southern California Earthquake Center's (SCEC) Community Modeling Environment, is developing a methodology that explicitly incorporates deterministic source and wave propagation effects within seismic hazard calculations through the use of physics-based 3D ground motion simulations. To calculate a waveform-based seismic hazard estimate for a site of interest, we begin with Uniform California Earthquake Rupture Forecast, Version 2.0 (UCERF2.0) and identify all ruptures within 200 km of the site of interest. We convert the UCERF2.0 rupture definition into multiple rupture variations with differing hypocenter locations and slip distributions, resulting in about 415,000 rupture variations per site. Strain Green Tensors are calculated for the site of interest using the SCEC Community Velocity Model, Version 4 (CVM4), and then, using reciprocity, we calculate synthetic seismograms for each rupture variation. Peak intensity measures are then extracted from these synthetics and combined with the original rupture probabilities to produce probabilistic seismic hazard curves for the site. Being explicitly site-based, CyberShake directly samples the ground motion variability at that site over many earthquake cycles (i.e., rupture scenarios) and alleviates the need for the ergodic assumption that is implicitly included in traditional empirically based calculations. Thus far, we have simulated ruptures at over 200 sites in the Los Angeles region for ground shaking periods of 2 s and longer, providing the basis for the first generation CyberShake hazard maps. Our results indicate that the combination of rupture directivity and basin response effects can lead to an increase in the hazard level for some sites, relative to that given by a conventional Ground Motion Prediction Equation (GMPE). Additionally, and perhaps more importantly, we find that the physics-based hazard results are much more sensitive to the assumed magnitude-area relations and magnitude uncertainty estimates used in the definition of the ruptures than is found in the traditional GMPE approach. This reinforces the need for continued development of a better understanding of earthquake source characterization and the constitutive relations that govern the earthquake rupture process.
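The final combination step, turning per-rupture-variation intensity measures and UCERF2.0 probabilities into a hazard curve, lends itself to a compact sketch. The following Python is a generic exceedance-curve computation under a Poisson occurrence assumption; the function name, data layout, and example values are illustrative, not CyberShake's actual data model:

```python
import numpy as np

def hazard_curve(ruptures, im_levels, duration_years=50.0):
    """Sketch of a site hazard curve from simulated intensity measures.

    ruptures  : list of (annual_rate, im_values) pairs, where im_values holds
                the peak intensity measure of every rupture variation
                (hypocenter/slip realization) simulated for that rupture
    im_levels : IM thresholds at which to evaluate exceedance
    """
    total_rate = np.zeros_like(im_levels, dtype=float)
    for annual_rate, im_values in ruptures:
        ims = np.asarray(im_values, dtype=float)
        # P(IM > level | rupture occurs): the fraction of this rupture's
        # variations whose simulated peak IM exceeds each level.
        p_exceed = (ims[:, None] > im_levels[None, :]).mean(axis=0)
        total_rate += annual_rate * p_exceed
    # Poisson conversion from annual exceedance rate to probability of
    # exceedance over the forecast duration.
    return 1.0 - np.exp(-total_rate * duration_years)

# Hypothetical example: two ruptures, three IM levels (e.g., 3 s SA in g).
curve = hazard_curve(
    [(0.01, [0.12, 0.30, 0.22]), (0.002, [0.45, 0.60])],
    im_levels=np.array([0.1, 0.3, 0.5]),
)
```

Averaging over the hundreds of thousands of variations per site is what lets the method sample ground-motion variability directly, rather than borrowing it from the ergodic assumption of empirical GMPEs.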
The GTL implemented in the USR is based on Ely et al. (2010) and uses the geology-based Vs30 maps of Wills and Clahan (2006) to specify velocity values at the Earth's surface in the voxet. $V_P$, and in turn density, are inferred from surface $V_S$ using the scaling laws of Brocher (2005). These values were parameterized to a depth of $z_T = 350$ meters with the following formulations:

$$V_S(z') = f(z')\,V_{ST} + g(z')\,V_{S30}, \qquad V_P(z') = f(z')\,V_{PT} + g(z')\,P(V_{S30}),$$

where $z' = z/z_T$ is normalized depth, $V_{ST}$ and $V_{PT}$ are S- and P-wave velocities extracted from the crustal velocity model at depth $z_T$, $P(\cdot)$ is the Brocher (2005) P-wave velocity scaling law, and

$$f(z') = z' + b\,(z' - z'^2), \qquad g(z') = a - a\,z' + c\,(z'^2 + 2\sqrt{z'} - 3z').$$

The coefficient a controls the ratio of surface velocity to the original 30-meter average, b controls the overall curvature, and c controls the near-surface curvature of the velocity profile. The coefficients a = 1/2, b = 2/3, and c = 3/2 were chosen to fit the generic rock profile of Boore and Joyner (1997) while also producing smooth and well-behaved profiles when combined with the underlying basin and crustal velocity models (Ely et al., 2010) (Figure 7).

S2 Model validation, comparison, and uncertainty

The velocity model (CVM) component of the USR described here is assembled from several different data sets and models, and it is therefore challenging to formally assess model resolution and uncertainty. One clear step for the sedimentary basins is to assess the variability in the well data that is not represented in the final model. As discussed, these data measure interval transit times over borehole distances of less than 1 m, whereas the velocity model uses smoothed (25 m sampled) versions of these data. To make this assessment, we compared observations directly with the velocity values represented at 108 well-bore locations in the Los Angeles basin. Our analysis shows a standard deviation of 6.5% around a mean of 1.0 for the ratio between compressional-wave slowness in the logs and in the model, over a population of ca. 1.1 million samples. This corresponds to a standard deviation in $V_P$ of ±99 m/s at 2000 m/s.
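As a concrete reading of the GTL taper formulated above, here is a brief Python sketch; the function and variable names are ours, and the f and g forms follow the reconstruction above (per Ely et al., 2010):

```python
import numpy as np

# Coefficients and taper depth from the text above.
A, B, C = 0.5, 2.0 / 3.0, 1.5   # a, b, c
Z_T = 350.0                      # taper depth in meters

def gtl_vs(z, vs30, vs_at_zt):
    """S-wave velocity between the surface and z_T (sketch, names ours).

    z        : depth in meters (0 <= z <= Z_T)
    vs30     : geology-based 30 m average S velocity (Wills & Clahan, 2006)
    vs_at_zt : V_ST, crustal-model S velocity extracted at depth Z_T
    """
    zn = np.asarray(z, dtype=float) / Z_T              # normalized depth z'
    f = zn + B * (zn - zn**2)
    g = A - A * zn + C * (zn**2 + 2.0 * np.sqrt(zn) - 3.0 * zn)
    return f * vs_at_zt + g * vs30

# Sanity checks: the profile starts at a * Vs30 and ties smoothly into the
# crustal model at z = Z_T, consistent with f(0)=0, g(0)=a, f(1)=1, g(1)=0.
assert np.isclose(gtl_vs(0.0, 400.0, 1800.0), 0.5 * 400.0)
assert np.isclose(gtl_vs(Z_T, 400.0, 1800.0), 1800.0)
```

The same f and g weights would be applied to $V_{PT}$ and $P(V_{S30})$ to obtain the P-wave profile.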