Deep generative models have been praised for their ability to learn smooth latent representations of images, text, and audio, which can then be used to generate new, plausible data. However, current generative models are unable to work with molecular graphs due to their unique characteristics: their underlying structure is not Euclidean or grid-like, they remain isomorphic under permutation of the node labels, and they come with varying numbers of nodes and edges. In this paper, we first propose a novel variational autoencoder for molecular graphs, whose encoder and decoder are specially designed to account for the above properties by means of several technical innovations. Moreover, in contrast with the state of the art, our decoder is able to provide the spatial coordinates of the atoms of the molecules it generates. Then, we develop a gradient-based algorithm to optimize the decoder of our model so that it learns to generate molecules that maximize the value of a certain property of interest and, given a molecule of interest, is able to optimize the spatial configuration of its atoms for greater stability. Experiments reveal that our variational autoencoder can discover plausible, diverse, and novel molecules more effectively than several state-of-the-art models. Moreover, for several properties of interest, our optimized decoder is able to identify molecules with property values 121% higher than those identified by several state-of-the-art methods based on Bayesian optimization and reinforcement learning.
Spaced repetition is a technique for efficient memorization which uses repeated review of content following a schedule determined by a spaced repetition algorithm to improve long-term retention. However, current spaced repetition algorithms are simple rule-based heuristics with a few hard-coded parameters. Here, we introduce a flexible representation of spaced repetition using the framework of marked temporal point processes and then address the design of spaced repetition algorithms with provable guarantees as an optimal control problem for stochastic differential equations with jumps. For two well-known human memory models, we show that, if the learner aims to maximize recall probability of the content to be learned subject to a cost on the reviewing frequency, the optimal reviewing schedule is given by the recall probability itself. As a result, we can then develop a simple, scalable online spaced repetition algorithm, MEMORIZE, to determine the optimal reviewing times. We perform a large-scale natural experiment using data from Duolingo, a popular language-learning online platform, and show that learners who follow a reviewing schedule determined by our algorithm memorize more effectively than learners who follow alternative schedules determined by several heuristics.
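The idea that the reviewing schedule is driven by the recall probability can be illustrated with a minimal sketch. The abstract does not give formulas, so the following are assumptions made for illustration: an exponential forgetting curve m(t) = exp(-n (t - t_last)), a reviewing intensity u(t) = (1 - m(t)) / sqrt(q) with q a hypothetical cost weight on reviewing frequency, and all function and parameter names. Review times are sampled from that intensity with the standard thinning method for point processes.

```python
import math
import random


def recall_probability(t, last_review, forgetting_rate):
    """Assumed exponential forgetting curve: m(t) = exp(-n * (t - t_last))."""
    return math.exp(-forgetting_rate * (t - last_review))


def sample_next_review(last_review, forgetting_rate, q, rng, horizon=float("inf")):
    """Sample the next review time by thinning.

    Assumed intensity: u(t) = (1 - m(t)) / sqrt(q). It is bounded above by
    1 / sqrt(q), which serves as the rate of the proposal process.
    Returns None if no review is scheduled before `horizon`.
    """
    max_intensity = 1.0 / math.sqrt(q)
    t = last_review
    while True:
        # Propose the next candidate time from a homogeneous Poisson process.
        t += rng.expovariate(max_intensity)
        if t >= horizon:
            return None
        # Accept with probability u(t) / max_intensity = 1 - m(t):
        # the lower the recall probability, the likelier a review.
        if rng.random() < 1.0 - recall_probability(t, last_review, forgetting_rate):
            return t


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_next_review(last_review=0.0, forgetting_rate=0.5, q=1.0, rng=rng))
```

Under these assumptions, an item that was just reviewed is unlikely to be scheduled again immediately, and the chance of a review grows as the item is forgotten. A larger q lowers the overall reviewing rate.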
Many social networks are characterized by actors (nodes) holding quantitative opinions about movies, songs, sports, people, colleges, politicians, and so on. These opinions are influenced by network neighbors. Many models have been proposed for such opinion dynamics, but they have some limitations. Most consider the strength of edge influence as fixed. Some model a discrete decision or action on the part of each actor, and an edge as causing an "infection" (that is often permanent or self-resolving). Others model edge influence as a stochastic matrix to reuse the mathematics of eigensystems. Actors' opinions are usually observed globally and synchronously. Analysis usually skirts transient effects and focuses on steady-state behavior. There is very little direct experimental validation of estimated influence models. Here we initiate an investigation into new models that seek to remove these limitations. Our main goal is to estimate, not assume, edge influence strengths from an observed series of opinion values at nodes. We adopt a linear (but not stochastic) influence model. We make no assumptions about system stability or convergence. Further, actors' opinions may be observed in an asynchronous and incomplete fashion, after missing several time steps when an actor changed its opinion based on neighbors' influence. We present novel algorithms to estimate edge influence strengths while tackling these aggressively realistic assumptions. Experiments with Reddit, Twitter, and three social games we conducted with volunteers establish the promise of our algorithms. Our opinion estimation errors are dramatically smaller than those of strong baselines such as the DeGroot, flocking, voter, and biased voter models. Our experiments also lend qualitative insights into asynchronous opinion updates and aggregation.
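To make "estimating edge influence from observed opinions" concrete, here is a minimal sketch of the simplest possible case. It is not the algorithm the abstract refers to: it assumes a linear update x(t+1) = A x(t) with fully and synchronously observed opinions, which is exactly the simplification the paper aims to move beyond. The function name, the array layout, and the use of ordinary least squares are all assumptions made for illustration.

```python
import numpy as np


def estimate_influence(opinions, adjacency):
    """Least-squares estimate of edge influence strengths.

    opinions:  (T, n) array, opinions[t, i] is actor i's opinion at step t.
    adjacency: (n, n) boolean array, adjacency[i, j] is True if j may
               influence i (include i itself to allow self-influence).
    Returns an (n, n) matrix A with A[i, j] the estimated influence of j on i.
    Rows of A are not constrained to sum to one (linear, not stochastic).
    """
    opinions = np.asarray(opinions, dtype=float)
    n = opinions.shape[1]
    prev, nxt = opinions[:-1], opinions[1:]
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = np.flatnonzero(adjacency[i])
        if nbrs.size == 0:
            continue
        # Regress actor i's next opinion on its neighbors' current opinions.
        coef, *_ = np.linalg.lstsq(prev[:, nbrs], nxt[:, i], rcond=None)
        A[i, nbrs] = coef
    return A


if __name__ == "__main__":
    A_true = np.array([[0.9, 0.3, 0.0],
                       [0.0, 0.8, -0.4],
                       [0.5, 0.0, 0.7]])
    x = np.array([1.0, -1.0, 0.5])
    series = [x]
    for _ in range(7):
        x = A_true @ x
        series.append(x)
    print(np.round(estimate_influence(np.array(series), A_true != 0), 3))
```

With asynchronous or missing observations the consecutive-step pairs used here are no longer available, which is where this naive regression breaks down.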
Decisions are increasingly taken by both humans and machine learning models. However, machine learning models are currently trained for full automation: they are not aware that some of the decisions may still be taken by humans. In this paper, we take a first step towards the development of machine learning models that are optimized to operate under different automation levels. More specifically, we first introduce the problem of ridge regression under human assistance and show that it is NP-hard. Then, we derive an alternative representation of the corresponding objective function as a difference of nondecreasing submodular functions. Building on this representation, we further show that the objective is nondecreasing and satisfies α-submodularity, a recently introduced notion of approximate submodularity. These properties allow a simple and efficient greedy algorithm to enjoy approximation guarantees for the problem. Experiments on synthetic and real-world data from two important applications, medical diagnosis and content moderation, demonstrate that the greedy algorithm outperforms several competitive baselines.
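A greedy selection of this kind can be sketched as follows. The abstract does not spell out the objective, so this sketch assumes one: the total loss is the sum of a known per-sample human error over the outsourced samples plus the minimum ridge-regression loss over the remaining samples, and at most `budget` samples may be outsourced. The names `human_err`, `budget`, and `lam`, and the brute-force evaluation of each candidate, are hypothetical choices made for illustration rather than the authors' implementation.

```python
import numpy as np


def machine_loss(X, y, lam):
    """Minimum ridge loss: min_w ||y - Xw||^2 + lam * m * ||w||^2, m = #rows."""
    m, d = X.shape
    if m == 0:
        return 0.0
    w = np.linalg.solve(X.T @ X + lam * m * np.eye(d), X.T @ y)
    return float(np.sum((y - X @ w) ** 2) + lam * m * (w @ w))


def total_loss(X, y, human_err, outsourced, lam):
    """Assumed objective: human error on outsourced samples + ridge loss on the rest."""
    mask = np.zeros(len(y), dtype=bool)
    mask[np.array(sorted(outsourced), dtype=int)] = True
    return float(human_err[mask].sum()) + machine_loss(X[~mask], y[~mask], lam)


def greedy_outsource(X, y, human_err, budget, lam=0.1):
    """Greedily pick up to `budget` samples to hand over to humans.

    At each step, add the sample whose outsourcing yields the lowest total loss.
    """
    chosen = set()
    for _ in range(budget):
        candidates = [i for i in range(len(y)) if i not in chosen]
        if not candidates:
            break
        best = min(candidates,
                   key=lambda i: total_loss(X, y, human_err, chosen | {i}, lam))
        chosen.add(best)
    return chosen


if __name__ == "__main__":
    X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
    y = np.array([2.0, 4.0, 6.0, 8.0, -10.0])  # last sample is an outlier
    human_err = np.full(5, 0.5)
    print(greedy_outsource(X, y, human_err, budget=1, lam=1e-3))
```

In this toy example the samples the linear model fits poorly are the ones worth outsourcing. Each greedy step refits the ridge model once per candidate, so the sketch favors clarity over speed.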