Any program tasked with the evaluation and acquisition of algorithms for use in deployed scenarios must have an impartial, repeatable, and auditable means of benchmarking both candidate and fielded algorithms. Success in this endeavor requires a body of representative sensor data, data labels indicating the proper algorithmic response to the data as adjudicated by subject matter experts, a means of executing algorithms under review against the data, and the ability to automatically score and report algorithm performance. Each of these capabilities should be constructed in support of program and mission goals. By curating and maintaining data, labels, tests, and scoring methodology, a program can understand and continually improve the relationship between benchmarked and fielded performance of acquired algorithms. A system supporting these program needs, deployed in an environment with sufficient computational power and necessary security controls is a powerful tool for ensuring due diligence in evaluation and acquisition of mission critical algorithms. This paper describes the Seascape system and its place in such a process.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.