The training problem for feedforward neural networks is a nonlinear parameter estimation problem that can be solved by a variety of optimization techniques. Much of the literature on neural networks has focused on variants of gradient descent. The training of neural networks using such techniques is known to be a slow process, with more sophisticated techniques not always performing significantly better. In this paper, we show that feedforward neural networks can have ill-conditioned Hessians and that this ill-conditioning can be quite common. The analysis and experimental results in this paper lead to the conclusion that many network training problems are ill-conditioned and may not be solved more efficiently by higher-order optimization methods. While our analyses are for completely connected layered networks, they extend to networks with sparse connectivity as well. Our results suggest that neural networks can have considerable redundancy in parameterizing the function space in a neighborhood of a local minimum, independently of whether or not the solution has a small residual.

1. Introduction. Some neural network techniques are, in a strictly mathematical sense, an approach to function approximation. As with most approximation methods, they require the estimation of certain (possibly nonunique) parameters which are defined by the problem to be solved [14]. In neural network terminology, finding those parameters is called the training problem, and algorithms for finding them are called training algorithms. This nomenclature comes from an analogy with biological systems: a set of inputs to the function to be approximated is presented to the network, and the parameters are adjusted to make the output of the network close in some sense to the known value of the function.

Feedforward neural networks use a specific parameterized functional form to approximate a desired input/output relation.
Typically, a system is sampled, resulting in a finite set of pairs (t, τ) ∈ R^p × R, where the first coordinate is a position in p-dimensional space and the second coordinate is the value assigned to that point. The feedforward neural network function, also a map from R^p to R, has a set of parameters, called weights, which must be determined so that the input and output values given by the sample data are matched as closely as possible by the approximating neural network. The neural network function for the i-th input pattern (i = 1, 2, ..., m) can be written succinctly in the form
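The abstract's central claim, that the Hessian of the training loss can be ill-conditioned, can be probed numerically. The following sketch is illustrative only (it is not the paper's experiments): the network sizes, data, and finite-difference step are all arbitrary choices. It builds a tiny one-hidden-layer network, forms the sum-of-squares training loss over m sample pairs, and reports the condition number of the loss Hessian at a random weight vector.

```python
import numpy as np

# Illustrative sketch: condition number of the Hessian of a sum-of-squares
# training loss for a tiny one-hidden-layer feedforward network.
# All sizes, data, and the random weight vector are arbitrary choices.

rng = np.random.default_rng(0)
p, h, m = 2, 3, 20              # input dim, hidden units, number of samples
T = rng.normal(size=(m, p))     # input patterns t_i in R^p
tau = np.sin(T[:, 0])           # scalar target value for each pattern

def unpack(w):
    """Split the flat weight vector into layer parameters."""
    W1 = w[:p * h].reshape(h, p)
    b1 = w[p * h:p * h + h]
    W2 = w[p * h + h:p * h + 2 * h]
    b2 = w[-1]
    return W1, b1, W2, b2

def loss(w):
    """Sum-of-squares residual of the network over all m patterns."""
    W1, b1, W2, b2 = unpack(w)
    hidden = np.tanh(T @ W1.T + b1)       # sigmoidal hidden layer
    out = hidden @ W2 + b2                # scalar network output per pattern
    return 0.5 * np.sum((out - tau) ** 2)

def hessian(f, w, eps=1e-5):
    """Central-difference Hessian; adequate for a handful of weights."""
    n = w.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = eps
            e_j = np.zeros(n); e_j[j] = eps
            H[i, j] = (f(w + e_i + e_j) - f(w + e_i - e_j)
                       - f(w - e_i + e_j) + f(w - e_i - e_j)) / (4 * eps ** 2)
    return H

w = rng.normal(scale=0.5, size=p * h + 2 * h + 1)   # 13 weights in total
H = hessian(loss, w)
print(f"Hessian condition number: {np.linalg.cond(H):.3e}")
```

Even at this toy scale, the printed condition number is typically large, consistent with the redundancy in the weight parameterization that the abstract describes.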
The purpose of this paper is to explore ways to generate random walks and polygons in confinement with a bias toward stiffness. Here stiffness refers to the curvature angle between two consecutive edges along the random walk or polygon: the stiffer the walk (polygon), the smaller this angle on average. Thus random walks and polygons with an elevated stiffness have lower-than-expected curvatures. We introduce and study several generation algorithms with a stiffness parameter s > 0 that regulates the expected curvature angle at a given vertex, in which the random walks and polygons are generated one edge at a time using conditional probability density functions. Our generating algorithms also allow the generation of unconfined random walks and polygons with any desired mean curvature angle. In the case of random walks and polygons confined in a sphere of fixed radius, we observe that, as expected, stiff random walks or polygons are more likely to be close to the confinement boundary. The methods developed here require that the random walks and random polygons be rooted at the center of the confinement sphere.
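The edge-by-edge generation described above can be approximated with a simple rejection-sampling sketch. This is an illustrative stand-in, not the paper's conditional probability density functions: here the stiffness bias is a von Mises-Fisher density proportional to exp(s·cos θ) around the previous edge direction (so larger s gives a smaller expected curvature angle), and confinement in a sphere of radius R is enforced by resampling any step that would leave it; the walk is rooted at the sphere's center.

```python
import numpy as np

# Illustrative stand-in for the paper's conditional-density generators:
# each unit edge direction is drawn with density proportional to
# exp(s * cos(theta)) relative to the previous direction, so larger s
# means a stiffer walk. Steps leaving the confinement sphere of radius R
# are rejected and the direction is resampled.

rng = np.random.default_rng(1)

def sample_vmf(mu, s):
    """Unit vector in R^3 with density proportional to exp(s * mu.x)."""
    if s < 1e-8:  # uniform direction on the sphere
        v = rng.normal(size=3)
        return v / np.linalg.norm(v)
    u = rng.uniform()
    # Inverse CDF of cos(theta) for the 3-d von Mises-Fisher distribution.
    w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * s)) / s
    phi = rng.uniform(0.0, 2.0 * np.pi)
    # Orthonormal frame perpendicular to mu.
    a = np.array([1.0, 0.0, 0.0]) if abs(mu[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(mu, a); e1 /= np.linalg.norm(e1)
    e2 = np.cross(mu, e1)
    r = np.sqrt(max(0.0, 1.0 - w * w))
    return w * mu + r * (np.cos(phi) * e1 + np.sin(phi) * e2)

def stiff_confined_walk(n_edges, s, R):
    """Unit-step walk rooted at the center of a sphere of radius R."""
    pts = [np.zeros(3)]
    d = sample_vmf(np.array([0.0, 0.0, 1.0]), 0.0)  # first edge: uniform
    for _ in range(n_edges):
        while np.linalg.norm(pts[-1] + d) > R:      # reject steps leaving sphere
            d = sample_vmf(d, s)
        pts.append(pts[-1] + d)
        d = sample_vmf(d, s)                        # bias next edge toward d
    return np.array(pts)

walk = stiff_confined_walk(n_edges=50, s=3.0, R=4.0)
print("max distance from center:", np.linalg.norm(walk, axis=1).max())
```

Comparing the printed maximum distance for small and large s illustrates the abstract's observation that stiffer confined walks tend to sit closer to the confinement boundary.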
Context: Linear Temporal Logic (LTL) has been used widely in verification. Its importance and popularity have only grown with the revival of temporal logic synthesis and with new uses of LTL in robotics and planning activities. All these uses demand that the user have a clear understanding of what an LTL specification means.

Inquiry: Despite the growing use of LTL, no studies have investigated the misconceptions users actually have in understanding LTL formulas. This paper addresses the gap with a first study of LTL misconceptions.

Approach: We study researchers' and learners' understanding of LTL in four rounds (three written surveys, one talk-aloud) spread across a two-year timeframe. Concretely, we decompose "understanding LTL" into three questions. A person reading a spec needs to understand what it is saying, so we study the mapping from LTL to English. A person writing a spec needs to go in the other direction, so we study English to LTL. However, misconceptions could arise from two sources: a misunderstanding of LTL's syntax or of its underlying semantics. Therefore, we also study the relationship between LTL formulas and specific traces.

Knowledge: We find several misconceptions that have consequences for learners, tool builders, and designers of new property languages. These findings are already resulting in changes to the Alloy modeling language. We also find that the English-to-LTL direction was the most common source of errors; unfortunately, this is the critical "authoring" direction in which a subtle mistake can lead to a faulty system. We contribute study instruments that are useful for training learners (whether academic or industrial) who are getting acquainted with LTL, and we provide a code book to assist in the analysis of responses to similar-style questions.

Grounding: Our findings are grounded in the responses to our survey rounds. Round 1 used Quizius to identify misconceptions among learners in a way that reduces the threat of expert blind spots.
Rounds 2 and 3 confirm that both additional learners and researchers (who work in formal methods, robotics, and related fields) make similar errors. Round 4 adds deep support for our misconceptions via talk-aloud surveys.

Importance: This work provides useful answers to two critical but unexplored questions: in what ways is LTL tricky, and what can be done about it? Our survey instruments can serve as a starting point for other studies.
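The third question studied, how LTL formulas relate to specific traces, can be made concrete with a small evaluator. The sketch below is a teaching aid only, not the study's instrument: it uses finite-trace semantics (closer to LTLf than to standard infinite-trace LTL), and the atoms req and grant are hypothetical.

```python
# Illustrative finite-trace evaluator for a few LTL operators, to make
# the "formulas vs. specific traces" question concrete. Finite-trace
# semantics differ from standard infinite-trace LTL semantics.

from typing import Callable, List, Set

Trace = List[Set[str]]                      # each step: the set of true atoms
Formula = Callable[[Trace, int], bool]      # formula(trace, position) -> bool

def atom(p: str) -> Formula:
    return lambda t, i: p in t[i]

def neg(f: Formula) -> Formula:
    return lambda t, i: not f(t, i)

def X(f: Formula) -> Formula:               # "next": f holds one step later
    return lambda t, i: i + 1 < len(t) and f(t, i + 1)

def G(f: Formula) -> Formula:               # "globally": f holds from i onward
    return lambda t, i: all(f(t, j) for j in range(i, len(t)))

def F(f: Formula) -> Formula:               # "finally": f holds at some step >= i
    return lambda t, i: any(f(t, j) for j in range(i, len(t)))

def U(f: Formula, g: Formula) -> Formula:   # "until": g eventually, f meanwhile
    return lambda t, i: any(
        g(t, j) and all(f(t, k) for k in range(i, j)) for j in range(i, len(t))
    )

# Hypothetical trace: req at steps 0 and 1, grant at step 2, nothing at step 3.
trace: Trace = [{"req"}, {"req"}, {"grant"}, set()]
print(F(atom("grant"))(trace, 0))           # True: grant occurs at step 2
print(G(atom("req"))(trace, 0))             # False: req fails at step 2
print(U(atom("req"), atom("grant"))(trace, 0))  # True: req holds until grant
```

Evaluating a candidate formula against hand-built traces like this is exactly the kind of exercise the study uses to separate syntactic misunderstandings from semantic ones.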
Generating questions to engage and measure students is often challenging and time-consuming. Furthermore, these questions do not always transfer well between student populations due to differences in background, course emphasis, or ambiguity in the questions or answers. We introduce a contributing student pedagogy activity facilitated by machine learning that can generate questions with associated answer-reasoning sets. We call this process Adaptive Tool-Driven Conception Generation. A tool implementing this process has been deployed, and it explicitly optimizes the process for questions that divide student opinion. In a study involving arrays in Java, this novel process generates questions similar to expert-designed questions, produces novel questions that identify potential student misconceptions, and provides statistical estimates of the prevalence of misconceptions. This process allows the generation of quiz and discussion questions with less expert effort, facilitates a subprocess in the creation of concept inventories, and also raises the possibility of running reproduction studies relatively cheaply. CCS CONCEPTS • Social and professional topics → Student assessment; • Computing methodologies → Machine learning;