Many visual analytics systems allow users to interact with machine learning models towards the goals of data exploration and insight generation on a given dataset. However, in some situations, insights may be less important than the production of an accurate predictive model for future use. In that case, users are more interested in generating diverse and robust predictive models, verifying their performance on holdout data, and selecting the most suitable model for their usage scenario. In this paper, we consider the concept of Exploratory Model Analysis (EMA), which is defined as the process of discovering and selecting relevant models that can be used to make predictions on a data source. We delineate the differences between EMA and the well-known term exploratory data analysis in terms of the desired outcome of the analytic process: insights into the data or a set of deployable models. The contributions of this work are a visual analytics system workflow for EMA, a user study, and two use cases validating the effectiveness of the workflow. We found that our system workflow enabled users to generate complex models, to assess them for various qualities, and to select the most relevant model for their task.
Projection techniques are often used to visualize high-dimensional data, allowing users to better understand the overall structure of multi-dimensional spaces on a 2D screen. Although many such methods exist, comparably little work has been done on generalizable methods of inverse projection, that is, the process of mapping the projected points, or more generally the projection space, back to the original high-dimensional space. In this article, we present NNInv, a deep learning technique with the ability to approximate the inverse of any projection or mapping. NNInv learns to reconstruct high-dimensional data from an arbitrary point on a 2D projection space, giving users the ability to interact with the learned high-dimensional representation in a visual analytics system. We provide an analysis of the parameter space of NNInv, and offer guidance in selecting these parameters. We extend validation of the effectiveness of NNInv through a series of quantitative and qualitative analyses. We then demonstrate the method's utility by applying it to three visualization tasks: interactive instance interpolation, classifier agreement, and gradient visualization.
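To make the idea of an inverse projection concrete, the following is a minimal sketch and not NNInv itself: it swaps the neural network for plain inverse-distance-weighted interpolation over already projected points. The function name `inverse_project` and its parameters are hypothetical and chosen only for illustration.

```python
import math

def inverse_project(point_2d, projected, originals, power=2.0, eps=1e-12):
    """Map a 2D point back to the original space by inverse-distance weighting.

    projected: list of (x, y) positions of known samples in the 2D projection.
    originals: the matching high-dimensional vectors, in the same order.
    This is an illustrative baseline, not the NNInv network.
    """
    weights = []
    for (px, py), original in zip(projected, originals):
        dist = math.hypot(point_2d[0] - px, point_2d[1] - py)
        if dist < eps:
            # The query sits on a known projected point: return its original.
            return list(original)
        weights.append(1.0 / dist ** power)

    total = sum(weights)
    dims = len(originals[0])
    # Weighted average of the original vectors, one dimension at a time.
    return [
        sum(w * vec[d] for w, vec in zip(weights, originals)) / total
        for d in range(dims)
    ]
```

Calling `inverse_project` on any screen position yields a synthetic high-dimensional sample, which is the kind of operation that interactive tasks such as instance interpolation rely on. A learned inverse, as described in the abstract, would replace the interpolation step.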
Photoswitches are molecules that undergo a reversible structural isomerization after exposure to different wavelengths of light. The dynamic control offered by molecular photoswitches is favorable for applications in materials chemistry, photopharmacology, and catalysis. Ideal photoswitches absorb visible light and have long-lived metastable isomers. We used high-throughput virtual screening to predict the absorption maxima (λmax) of the E-isomer and half-lives (t1/2) of the Z-isomer. However, computing the photophysical and kinetic properties of each entry of a virtual molecular library containing 10^3-10^6 entries with density functional theory is prohibitively time-consuming. We applied active search, a machine learning technique, to intelligently search a chemical space of 255,991 photoswitches based on 29 known azoarenes and their derivatives. We iteratively trained the active search algorithm based on whether a candidate absorbed visible light (λmax > 450 nm). Active search was found to triple the discovery rate compared to random search. Further, we projected 1,962 photoswitches to 2D using the Uniform Manifold Approximation and Projection (UMAP) algorithm and found that λmax depends on the core, which is tunable with substituents. We then incorporated a second stage of screening to predict the stabilities of the Z-isomers for the top 1% of candidates. We identified four ideal photoswitches that concurrently satisfy λmax > 450 nm and t1/2 > 2 hours; their λmax values range from 465 to 531 nm and their t1/2 values from hours to days.
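The iterative loop behind active search can be sketched as follows. This is a minimal greedy variant under stated assumptions, not the authors' implementation: candidates are reduced to a single hypothetical numeric descriptor, the probability model is a simple k-nearest-neighbor estimate, and `oracle` stands in for the expensive calculation that decides whether a candidate is a hit (for example, λmax > 450 nm).

```python
def knn_probability(x, labeled, k=3, prior=0.1):
    """Estimate P(hit) for candidate x from its k nearest labeled neighbors."""
    if not labeled:
        return prior
    nearest = sorted(labeled, key=lambda item: abs(item[0] - x))[:k]
    hits = sum(label for _, label in nearest)
    # Smooth toward the prior so unexplored regions never score exactly zero.
    return (hits + prior) / (len(nearest) + 1)

def active_search(pool, oracle, budget, k=3):
    """Greedy one-step active search over a pool of candidate descriptors.

    pool:   list of numeric descriptors (hypothetical 1-D stand-in for molecules).
    oracle: function returning 1 for a hit and 0 otherwise (the costly evaluation).
    budget: number of oracle calls allowed.
    Returns the number of hits found within the budget.
    """
    labeled, unlabeled, found = [], list(pool), 0
    for _ in range(min(budget, len(unlabeled))):
        # Query the candidate the current model considers most likely to be a hit.
        best = max(unlabeled, key=lambda x: knn_probability(x, labeled, k))
        unlabeled.remove(best)
        label = oracle(best)
        labeled.append((best, label))  # retrain: the model is the labeled set
        found += label
    return found
```

Each oracle result immediately changes the ranking of the remaining candidates, which is what lets the search concentrate its budget near earlier hits instead of sampling the pool uniformly as random search does.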