Inspired by the collective behavior of fish schools, the fish school search (FSS) algorithm is a technique for finding globally optimal solutions. The algorithm is characterized by its simplicity and high performance; FSS is computationally inexpensive compared to other evolution-inspired algorithms. However, premature convergence is inherent to FSS, especially when optimizing functions that are defined in very-high-dimensional spaces and have many local minima or maxima. The accuracy of the obtained solution depends strongly on the initial distribution of agents in the search space and on the predefined initial individual and collective-volitive movement step sizes. In this paper, we study different chaotic maps with symmetric distributions, used as pseudorandom number generators (PRNGs) in FSS. In addition, we incorporate exponential step decay in order to improve the accuracy of the solutions produced by the algorithm. The results of the numerical experiments show that using chaotic maps instead of other commonly used high-quality PRNGs can speed up the algorithm, and that the incorporated exponential step decay can improve the accuracy of the obtained solution. Different pseudorandom number distributions produced by the considered chaotic maps can positively affect the accuracy of the algorithm in different optimization problems. Overall, the uniform pseudorandom number distribution generated by the tent map produced the most accurate results. Moreover, the tent-map-based PRNG achieved the best performance when compared to other chaotic maps and nonchaotic PRNGs. To demonstrate the effectiveness of the proposed optimization technique, we compare the tent-map-based FSS algorithm with exponential step decay (ETFSS) with particle swarm optimization (PSO) and with the genetic algorithm with tournament selection (GA) on test functions for optimization.
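The two ingredients named in this abstract, a tent-map PRNG and exponential step decay, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the seed `x0`, the slope `mu` (set slightly below 2 so the floating-point orbit does not collapse to zero), and the particular decay formula are all assumptions made for this sketch.

```python
def tent_map_stream(x0=0.37, mu=1.99999):
    """Yield pseudorandom numbers in (0, 1) by iterating the tent map.

    x0 and mu are illustrative choices, not values taken from the paper.
    """
    x = x0
    while True:
        # Tent map: stretch the left half, fold the right half back.
        x = mu * x if x < 0.5 else mu * (1.0 - x)
        yield x


def exp_decay_step(step_initial, step_final, t, t_max):
    """Step size at iteration t, decaying exponentially from step_initial
    (at t = 0) to step_final (at t = t_max).

    One common form of exponential decay; the paper may use another.
    """
    return step_initial * (step_final / step_initial) ** (t / t_max)
```

In an FSS loop, values drawn from `tent_map_stream()` would stand in for calls to a conventional PRNG, and `exp_decay_step` would set the individual and collective-volitive step sizes at each iteration.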
Objectives. Recent research in machine learning and artificial intelligence aimed at improving prediction accuracy and reducing computational complexity has resulted in a novel neural network architecture referred to as an extreme learning machine (ELM). An ELM comprises a single-hidden-layer feedforward neural network in which the weights of connections between input-layer neurons and hidden-layer neurons are initialized randomly, while the weights of connections between hidden-layer neurons and output-layer neurons are computed using a generalized Moore–Penrose pseudoinverse operation. Replacing the iterative learning process currently used in many neural network architectures with the random initialization of input weights and the explicit computation of output weights significantly increases the performance of this novel machine learning algorithm while preserving good generalization performance. However, since the random initialization of input weights does not necessarily guarantee optimal prediction accuracy, the purpose of the present work was to develop and study approaches to intelligent adjustment of input weights in ELMs using bioinspired algorithms in order to improve the prediction accuracy of this data analysis tool in regression problems. Methods. Methods of optimization theory, the theory of evolutionary computation and swarm intelligence, probability theory, mathematical statistics, and systems analysis were used. Results. Approaches to the intelligent adjustment of input weights in ELMs were developed and studied. These approaches are based on the genetic algorithm, the particle swarm algorithm, the fish school search algorithm, and the chaotic fish school search algorithm with exponential step decay proposed by the authors.
It was shown that adjusting input weights with bioinspired optimization algorithms can improve the prediction accuracy of ELMs in regression problems and reduce the number of hidden-layer neurons needed to reach high prediction accuracy on training and test datasets. In the considered problems, the best ELM configurations can be obtained using the chaotic fish school search algorithm with exponential step decay. Conclusions. The obtained results showed that the prediction accuracy of ELMs can be improved by using bioinspired algorithms for the intelligent adjustment of input weights. Additional calculations are required to adjust the weights; therefore, the use of ELMs in combination with bioinspired algorithms may be advisable where it is necessary to obtain the most accurate and most compact ELM configuration.
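The ELM training scheme described in this abstract can be sketched with NumPy as a single fitness function: given candidate input weights, it computes the output weights with the Moore–Penrose pseudoinverse and returns the training error. The function name `elm_rmse`, the flat weight layout, and the `tanh` activation are assumptions of this sketch; the abstract does not specify them.

```python
import numpy as np


def elm_rmse(flat_weights, X, y, n_hidden):
    """Training RMSE of an ELM whose input weights and biases are given.

    flat_weights holds n_in * n_hidden input weights followed by n_hidden
    hidden biases (a hypothetical layout chosen for this sketch).
    """
    n_in = X.shape[1]
    W = flat_weights[: n_in * n_hidden].reshape(n_in, n_hidden)
    b = flat_weights[n_in * n_hidden:]
    H = np.tanh(X @ W + b)            # hidden-layer outputs (tanh is assumed)
    beta = np.linalg.pinv(H) @ y      # output weights via the pseudoinverse
    return float(np.sqrt(np.mean((H @ beta - y) ** 2)))
```

A plain ELM corresponds to sampling `flat_weights` once at random. A bioinspired optimizer such as GA, PSO, or FSS would instead treat `flat_weights` as the position of an agent and minimize `elm_rmse`, at the cost of one pseudoinverse computation per fitness evaluation.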
This paper presents a dataset containing automatically collected source codes solving unique programming exercises of different types. The programming exercises were automatically generated by the Digital Teaching Assistant (DTA) system that automates a massive Python programming course at MIREA—Russian Technological University (RTU MIREA). Source codes of the small programs grouped by the type of the solved task can be used for benchmarking source code classification and clustering algorithms. Moreover, the data can be used for training intelligent program synthesizers or benchmarking mutation testing frameworks, and more applications are yet to be discovered. We describe the architecture of the DTA system, aiming to provide detailed insight regarding how and why the dataset was collected. In addition, we describe the algorithms responsible for source code analysis in the DTA system. These algorithms use vector representations of programs based on Markov chains, compute pairwise Jensen–Shannon divergences of programs, and apply hierarchical clustering algorithms in order to automatically discover high-level concepts used by students while solving unique tasks. The proposed approach can be incorporated into massive programming courses when there is a need to identify approaches implemented by students.
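The analysis pipeline outlined in this abstract (Markov-chain-based program vectors compared with the Jensen–Shannon divergence) can be illustrated with the standard library alone. The sketch below is a simplified stand-in, not the DTA system's code: it assumes that transitions are taken between parent and child node types of the Python AST and represents each program by the relative frequencies of those transitions. Both function names are hypothetical.

```python
import ast
import math
from collections import Counter


def transition_distribution(source):
    """Relative frequencies of (parent type, child type) AST transitions."""
    counts = Counter()
    for parent in ast.walk(ast.parse(source)):
        for child in ast.iter_child_nodes(parent):
            counts[(type(parent).__name__, type(child).__name__)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, in [0, 1]) of two sparse distributions."""
    div = 0.0
    for key in set(p) | set(q):
        pk, qk = p.get(key, 0.0), q.get(key, 0.0)
        mk = 0.5 * (pk + qk)          # mixture distribution
        if pk > 0.0:
            div += 0.5 * pk * math.log2(pk / mk)
        if qk > 0.0:
            div += 0.5 * qk * math.log2(qk / mk)
    return div
```

Computing `js_divergence` for every pair of submissions gives a distance matrix that a hierarchical clustering routine (for example `scipy.cluster.hierarchy.linkage`) could consume to group solutions by approach.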
Dimensionality reduction techniques are often used by researchers to make high-dimensional data easier to interpret visually, as data visualization is only possible in low-dimensional spaces. Recent research in nonlinear dimensionality reduction has introduced many effective algorithms, including t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), a dimensionality reduction technique based on triplet constraints (TriMAP), and pairwise controlled manifold approximation (PaCMAP), which aim to preserve both the local and global structure of high-dimensional data while reducing the dimensionality. The UMAP algorithm has found application in bioinformatics, genetics, and genomics, and has been widely used to improve the accuracy of other machine learning algorithms. In this research, we compare the performance of different fuzzy information discrimination measures used as loss functions in the UMAP algorithm when constructing low-dimensional embeddings. To this end, we derive the gradients of the considered losses analytically and employ the Adam algorithm to optimize the loss function. From the experimental studies, we conclude that using either the logarithmic fuzzy cross entropy loss without reduced repulsion or the symmetric logarithmic fuzzy cross entropy loss with a sufficiently large neighbor count leads to better global structure preservation of the original multidimensional data when compared to the loss function used in the original UMAP algorithm implementation.
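For orientation, the baseline that the alternative measures are compared against, the fuzzy set cross entropy of the original UMAP formulation, can be written out as a short function. This sketch covers only that standard loss; the other discrimination measures studied in the paper are not reproduced here. The default curve parameters `a = 1.0` and `b = 1.0` and the clipping constant `eps` are assumptions made for illustration (UMAP normally fits `a` and `b` from its `min_dist` setting).

```python
import math


def low_dim_membership(distance, a=1.0, b=1.0):
    """Low-dimensional edge membership 1 / (1 + a * d^(2b)); a and b are assumed."""
    return 1.0 / (1.0 + a * distance ** (2.0 * b))


def fuzzy_cross_entropy(p, q, eps=1e-12):
    """Fuzzy set cross entropy between edge memberships p (high-dimensional)
    and q (low-dimensional), summed over edges."""
    total = 0.0
    for pi, qi in zip(p, q):
        qi = min(max(qi, eps), 1.0 - eps)     # keep the logarithms finite
        if pi > 0.0:
            total += pi * math.log(pi / qi)                        # attraction
        if pi < 1.0:
            total += (1.0 - pi) * math.log((1.0 - pi) / (1.0 - qi))  # repulsion
    return total
```

The first term pulls together points that are neighbors in the original space, and the second pushes apart points that are not; the loss is zero when the two sets of memberships coincide.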