Multi-objective verification problems of parametric Markov decision processes under optimality criteria can be naturally expressed as nonlinear programs. We observe that many of these computationally demanding problems belong to the subclass of signomial programs. This insight allows for a sequential optimization algorithm that efficiently computes sound but possibly suboptimal solutions. Each stage of this algorithm solves a geometric programming problem. These geometric programs are obtained by convexifying the nonconvex constraints of the original problem. Direct applications of the encodings as nonlinear programs are model repair and parameter synthesis. We demonstrate the scalability and quality of our approach on well-known benchmarks.
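As an illustration of the convexification step mentioned above, the following is a minimal sketch of one standard technique from the geometric programming literature: replacing a posynomial by its local monomial approximation, which by the arithmetic-geometric mean inequality is a global lower bound that is tight at the expansion point. For a signomial constraint f(x) <= g(x) with posynomials f and g, substituting the monomial bound for g yields a conservative geometric programming constraint, consistent with "sound but possibly suboptimal". Whether the paper uses exactly this approximation is an assumption; the function names and the example posynomial are hypothetical.

```python
import math


def posynomial(terms, x):
    """Evaluate sum_k c_k * prod_i x_i**a_ki for positive x.

    `terms` is a list of (coefficient, exponents) pairs with c_k > 0.
    """
    return sum(c * math.prod(xi ** a for xi, a in zip(x, exps))
               for c, exps in terms)


def monomial_approximation(terms, x0):
    """Return (coeff, alpha) of the monomial coeff * prod_i x_i**alpha_i
    that matches the posynomial's value and log-gradient at x0."""
    values = [c * math.prod(xi ** a for xi, a in zip(x0, exps))
              for c, exps in terms]
    f0 = sum(values)
    # alpha_i is the weighted average of the exponents of x_i,
    # weighted by each term's share of the total value at x0.
    alpha = [sum(v * exps[i] for v, (_, exps) in zip(values, terms)) / f0
             for i in range(len(x0))]
    coeff = f0 / math.prod(xi ** a for xi, a in zip(x0, alpha))
    return coeff, alpha


def monomial(coeff, alpha, x):
    """Evaluate coeff * prod_i x_i**alpha_i."""
    return coeff * math.prod(xi ** a for xi, a in zip(x, alpha))


# Hypothetical posynomial g(x, y) = x + 2*y**2 + 0.5*y/x, expanded at (1, 2).
g_terms = [(1.0, (1, 0)), (2.0, (0, 2)), (0.5, (-1, 1))]
x0 = (1.0, 2.0)
coeff, alpha = monomial_approximation(g_terms, x0)
```

In a sequential scheme, the expansion point `x0` would be updated to the solution of the previous geometric program and the approximation recomputed at each stage.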
We consider the problem of learning from demonstration, where extra side information about the demonstration is encoded as a co-safe linear temporal logic formula. We address two known limitations of existing methods that do not account for such side information. First, the policies that result from existing methods, while matching the expected features or likelihood of the demonstrations, may still be in conflict with high-level objectives not explicit in the demonstration trajectories. Second, existing methods fail to provide a priori guarantees on the out-of-sample generalization performance with respect to such high-level goals. This lack of formal guarantees can prevent the application of learning from demonstration to safety-critical systems, especially when inference to state space regions with poor demonstration coverage is required. In this work, we show that side information, when explicitly taken into account, indeed improves the performance and safety of the learned policy with respect to task implementation. Moreover, we describe an automated procedure to systematically generate the features that encode side information expressed in temporal logic.
We investigate a sampling-based method for optimal control of continuous-time and continuous-state (possibly nonlinear) systems under co-safe linear temporal logic specifications. We express the temporal logic specification as a deterministic finite automaton (the specification automaton), and link the automaton's discrete transitions to the continuous system state as it passes through specified regions. The optimal hybrid controller is characterized by a set of coupled partial differential equations. Because these equations are difficult to solve exactly in practice, we propose instead a sampling-based technique to solve for an approximate controller through approximate value iteration. We adopt model reference adaptive search (an importance sampling optimization algorithm) to determine the mixing weights of the approximate value function expressed in a finite basis. Under mild technical assumptions, the algorithm converges, with probability one, to an optimal weight that ensures the satisfaction of temporal logic constraints, while minimizing an upper bound for the optimal cost. We demonstrate the correctness and efficiency of the method through numerical experiments, including temporal logic planning for a linear system and a nonlinear mobile robot.
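The coupling between the specification automaton and the continuous state described above can be sketched as follows. This is only an illustrative example, not the controller synthesis itself: the co-safe formula F(a ∧ F b) ("visit region a, then region b"), the rectangular regions, and all names are hypothetical choices made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangular region in the plane (hypothetical shape)."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


# Hypothetical labelled regions of the continuous state space.
REGIONS = {"a": Box(0.0, 1.0, 0.0, 1.0), "b": Box(3.0, 4.0, 3.0, 4.0)}

# DFA states for F(a & F b): 0 = nothing seen, 1 = a seen, 2 = accepting.
ACCEPTING = 2


def labels_at(x, y):
    """Set of region labels that hold at the continuous state (x, y)."""
    return {name for name, box in REGIONS.items() if box.contains(x, y)}


def dfa_step(q, labels):
    """Advance the automaton on the labels observed at the current state."""
    if q == 0 and "a" in labels:
        q = 1
    if q == 1 and "b" in labels:
        q = 2  # absorbing: a co-safe formula is satisfied by a finite prefix
    return q


def run(trajectory, q0=0):
    """Track the automaton state along a sampled continuous trajectory."""
    q = q0
    for x, y in trajectory:
        q = dfa_step(q, labels_at(x, y))
    return q


final_state = run([(0.5, 0.5), (2.0, 2.0), (3.5, 3.5)])
```

The pair (continuous state, automaton state) forms the hybrid state over which a value function would be defined; a trajectory satisfies the specification once the automaton reaches `ACCEPTING`.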