2018
DOI: 10.1109/tr.2018.2846675
Modeling Impact of Human Errors on the Data Unavailability and Data Loss of Storage Systems

Abstract: Data storage systems and their availability play a crucial role in contemporary datacenters. Despite mechanisms such as automatic fail-over, the involvement of human agents, and consequently their destructive errors, remains unavoidable. Because of the very large number of disk drives used in exascale datacenters and their high failure rates, the disk subsystem in storage systems has become a major source of Data Unavailability (DU) and Data Loss (DL) initiated by human errors. In this paper, we investigate the …

Cited by 9 publications (12 citation statements)
References 52 publications
“…There are different metrics in the literature used to measure data storage system reliability. For example, the authors in [30] propose the NOrmalized Magnitude of Data Loss (NOMDL_t) metric, which measures the expected amount of data lost per usable terabyte within mission time t, as a better alternative to the standard Mean Time To Data Loss (MTTDL) metric [31], [32]; similarly, a novel metric named NOrmalized Magnitude of Data Unavailability (NOMDU) is proposed in [33] to measure the availability of data storage systems. The authors in [34] have used the standard annualized failure rate (AFR) metric to analyze SMART data for describing a disk's fail-stop rate.…”
Section: Related Work
confidence: 99%
“…When data are stored on ground-based systems such as computers, rich processing and cache resources provide powerful hardware support for solving these problems and realizing high-speed data storage. However, in the storage system of an airborne radar, especially on a micro-UAV platform [28, 29], the volume, weight, and power consumption of the storage system are strictly limited, and computing and cache resources are relatively scarce [30, 31]. Therefore, higher requirements are placed on optimizing the file management of the airborne radar's storage system.…”
Section: Introduction
confidence: 99%
“…Mathematical representations map component states into system states and are used to compute reliability indices and measures for system evaluation. There are many types of mathematical representations, such as fault trees [3], reliability block diagrams [4], structure functions [4,5], Markov models [6], Petri nets [7], Bayesian belief networks [8,9], credal networks [10,11], survival signatures [12] and others. The choice of representation depends on the application problem and the specifics of the risk/reliability analysis.…”
Section: Introduction
confidence: 99%
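Of the representations listed above, a Markov model is the simplest to sketch concretely. Below is a minimal, hypothetical example: a single repairable component with exponential failure rate λ and repair rate μ, whose steady-state availability μ/(λ+μ) follows directly from the balance equations; the per-hour rates are illustrative only and are not taken from the paper:

```python
def steady_state_availability(lam, mu):
    """Steady-state availability of a two-state Markov model:
    state 'up' fails at rate lam, state 'down' is repaired at rate mu.
    Balance equation: pi_up * lam = pi_down * mu, with pi_up + pi_down = 1,
    which gives pi_up = mu / (lam + mu)."""
    return mu / (lam + mu)

# Hypothetical per-hour rates: one failure per 1000 h, 10 h mean repair
A = steady_state_availability(lam=0.001, mu=0.1)
```

This closed form is what larger Markov availability models generalize: with more states, the same balance equations become a linear system solved numerically.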
“…The advantages of these mathematical representations are their simplicity and the possibility of constructing them for a system of any structural complexity [13], but new methods need to be developed for their application in time-dependent reliability analysis, such as using a survival signature [12,14], a credal network [11,15] or another method [16]. Markov models or Monte Carlo simulations can be adopted for time-dependent (dynamic) analysis of system reliability [6,15]. Probabilistic models are used for the mathematical representation in cases of incompletely specified or uncertain data and of relationships among system components that cannot be represented deterministically [2,9].…”
Section: Introduction
confidence: 99%
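To illustrate the Monte Carlo route mentioned above for time-dependent analysis, the sketch below estimates the reliability R(t) of a hypothetical 2-out-of-3 system with independent exponential component lifetimes and checks the estimate against the closed form 3p² − 2p³ with p = e^(−λt); the failure rate and mission time are assumed values, not the paper's:

```python
import math
import random

def reliability_2oo3_mc(t, lam, n_trials=20000, seed=7):
    """Monte Carlo estimate of R(t) for a 2-out-of-3 system whose
    components have independent exponential lifetimes with rate lam.
    Purely illustrative; real dynamic models would add repair and
    human-error events as in the paper."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(n_trials):
        # Count components still alive at time t
        alive = sum(1 for _ in range(3) if rng.expovariate(lam) > t)
        if alive >= 2:
            ok += 1
    return ok / n_trials

def reliability_2oo3_exact(t, lam):
    # Closed form: R = 3p^2 - 2p^3, where p = exp(-lam * t)
    p = math.exp(-lam * t)
    return 3 * p**2 - 2 * p**3

est = reliability_2oo3_mc(t=1.0, lam=0.1)
exact = reliability_2oo3_exact(t=1.0, lam=0.1)
```

The agreement between the simulated and analytic values is the usual sanity check before extending such a simulation to structures with no tractable closed form.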