Large and diverse datasets can now be simulated with associated truth to train and evaluate artificial intelligence (AI) and machine learning (ML) algorithms. This convergence of readily accessible simulation (SIM) tools, real-time high-performance computing, and large repositories of high-quality, free-to-inexpensive photorealistic scanned assets is a potential AI/ML game changer. While this feat is now within our grasp, what SIM data should be generated, how should it be generated, and how can this be achieved in a controlled and scalable fashion? First, we discuss a formal procedural language for specifying scenes (LSCENE) and collecting sampled datasets (LCAP). Second, we discuss specifics regarding our production and storage of data, ground truth, and metadata. Last, two LSCENE/LCAP examples are discussed and three unmanned aerial vehicle (UAV) AI/ML use cases are provided to demonstrate the range and behavior of the proposed ideas. Overall, this article is a step towards closed-loop automated AI/ML design and evaluation.
Current-generation artificial intelligence (AI) is heavily reliant on data and supervised learning (SL). However, obtaining dense and accurate truth for SL is often a bottleneck, and any imperfections in that truth can degrade performance and/or introduce biases. As a result, several corrective lines of research are being explored, including simulation (SIM). In this article, we discuss fundamental limitations in obtaining truth, both in the physical universe and in SIM, and explore different truth uncertainty modeling strategies. A case study from data-driven monocular vision is provided. These experiments demonstrate how performance varies with the truth uncertainty strategy used in training and evaluating AI algorithms.