“…Although the annotation methods, and outcomes of interest, differed across all studies, most of the reports specified definitions or protocols prior to the annotation process to classify the alarms; 10 reports focused on the classification of alarms into true and false [16, 17, 30-33, 39, 40], audible [38] or not, or based on the alarm type [36]. The remaining 13 reports evaluated if the alarm required or was followed by a medical action such as a therapeutic or diagnostic intervention while using different definitions and terms: "true positive, clinically relevant" [15], "clinically relevant" [3, 23, 24, 34], "relevant" [28], "relevant or true" [25], "true" [26, 27], "actionable" [29, 35, 41], and "consistent" [37]. The researchers mostly used data from monitoring systems including waveforms, measurements, alarms, and settings.…”