Five simple soft sensor methodologies with two update conditions were compared on two experimentally obtained datasets and one simulated dataset. The soft sensors investigated were moving window partial least squares regression (and a recursive variant), moving window random forest regression, the mean moving window of y, and a novel random forest partial least squares regression ensemble (RF-PLS), all of which can be used with small sample sizes so that they can be rapidly placed online. On two of the datasets studied, small window sizes led to the lowest prediction errors for all of the moving window methods studied. On the majority of datasets studied, the RF-PLS calibration method offered lower one-step-ahead prediction errors than the other methods, and it demonstrated greater predictive stability at larger time delays than moving window PLS alone. Both the random forest and RF-PLS methods most adequately modeled the datasets that did not feature purely monotonic increases in property values, but both methods performed more poorly than moving window PLS models on the one dataset with purely monotonic property values. Other data-dependent findings are presented and discussed.
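As a rough illustration of the simplest method named above, the mean moving window of y, the following sketch predicts each value one step ahead as the mean of the previous `window` observations and reports the root mean squared error for several window sizes. The function names and the toy series are hypothetical and are not taken from the paper; the update conditions and the PLS and random forest models are not reproduced here.

```python
from math import sqrt


def moving_window_mean_forecast(y, window):
    """One-step-ahead predictions: y[t] is predicted as mean(y[t-window:t])."""
    preds = []
    for t in range(window, len(y)):
        recent = y[t - window:t]  # the most recent `window` observed values
        preds.append(sum(recent) / window)
    return preds


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical, purely monotonic toy series (not data from the paper).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
for w in (1, 2, 3):
    preds = moving_window_mean_forecast(y, w)
    print(w, rmse(y[w:], preds))
```

On this steadily increasing toy series the error grows with the window size, because a longer window averages over older, lower values. This is only a property of the toy data and says nothing about the datasets studied in the paper.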
The construction of the unconditionally stable planar rank 2 scattering (S) matrix for stratified systems is detailed from the Fresnel equations. Several matrix decompositions and numerical calculations performed on the planar S matrix allow for the expedient characterization of purely absorbing, Brewster, surface plasmon, and waveguide modes. A figure of merit is presented from the decompositions of the scattering matrix constructed from the Chandezon method for corrugated surfaces. This figure of merit represents the hyperarea of the scattering matrix transform and allows lossy absorption phenomena and surface plasmons to be distinguished rapidly. Some extension of this technique is possible for surface plasmon polaritons in the infrared region.
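The Fresnel equations that the S matrix is built from can be sketched for a single planar interface. The code below is not the paper's scattering matrix construction; it only evaluates the standard s- and p-polarized amplitude reflection coefficients and checks that the p-polarized reflection vanishes at the Brewster angle. The function name and the refractive indices (1.0 and 1.5) are assumptions chosen for illustration, and the square-root branch used for the transmitted wave is the default of `cmath.sqrt`, which may need a different choice for absorbing media.

```python
import cmath
import math


def fresnel_reflection(n1, n2, theta_i):
    """Amplitude reflection coefficients (r_s, r_p) for light going from
    index n1 into index n2 at incidence angle theta_i (radians)."""
    cos_i = math.cos(theta_i)
    sin_t = n1 * math.sin(theta_i) / n2   # Snell's law
    cos_t = cmath.sqrt(1 - sin_t ** 2)    # complex-safe square root
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    return r_s, r_p


# Illustrative lossless interface (assumed values, not from the paper).
n1, n2 = 1.0, 1.5
theta_b = math.atan(n2 / n1)  # Brewster angle for lossless media
r_s, r_p = fresnel_reflection(n1, n2, theta_b)
print(abs(r_s), abs(r_p))     # |r_p| is numerically zero at the Brewster angle
```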
Exploratory data analysis is crucial for developing and understanding classification models from high-dimensional datasets. We explore the utility of a new unsupervised tree ensemble called uncharted forest for visualizing class associations, sample-sample associations, class heterogeneity, and uninformative classes for provenance studies. The uncharted forest algorithm can be used to partition data using random selections of variables and metrics based on statistical spread. After each tree is grown, a tally of the samples that arrive at every terminal node is maintained. Those tallies are stored in a single sample association matrix, from which a likelihood measure of any two samples being partitioned together can be computed. That matrix may be readily viewed as a heat map, and the probabilities can be quantified via new metrics that account for class or cluster membership. We display the advantages and limitations of using this technique by applying it to two classification datasets and three provenance study datasets. Two of the metrics presented in this paper are also compared with widely used metrics from two algorithms that have variance-based clustering mechanisms.
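The tallying step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each tree has already been grown and is summarized by a list giving the terminal node reached by every sample, and it takes the likelihood measure to be the fraction of trees in which two samples share a terminal node. The function name and the leaf assignments are hypothetical.

```python
def association_matrix(leaf_assignments):
    """Fraction of trees in which each pair of samples shares a terminal node.

    leaf_assignments[k][i] is the terminal node that sample i reaches in tree k.
    """
    n_trees = len(leaf_assignments)
    n_samples = len(leaf_assignments[0])
    counts = [[0] * n_samples for _ in range(n_samples)]
    for leaves in leaf_assignments:
        for i in range(n_samples):
            for j in range(n_samples):
                if leaves[i] == leaves[j]:
                    counts[i][j] += 1  # tally one co-occurrence for this tree
    return [[c / n_trees for c in row] for row in counts]


# Hypothetical terminal-node labels for 4 samples across 3 trees.
leaves_per_tree = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
for row in association_matrix(leaves_per_tree):
    print([round(v, 2) for v in row])
```

The resulting square matrix is symmetric with ones on the diagonal, which is the form that can be displayed directly as a heat map.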
Optimized spatial partitioning algorithms are the cornerstone of many successful experimental designs and statistical methods. Of these algorithms, the Centroidal Voronoi Tessellation (CVT) is the most widely utilized. CVT-based methods require global knowledge of spatial boundaries, do not readily allow for weighted regions, have challenging implementations, and are inefficiently extended to high-dimensional spaces. We describe two simple partitioning schemes based on nearest and next-nearest neighbor locations which easily incorporate these features at the slight expense of optimal placement. Several novel qualitative techniques which assess these partitioning schemes are also included. The feasibility of autonomous uninformed sensor networks utilizing these algorithms is considered. Some improvements in particle swarm optimizer results on multimodal test functions from partitioned initial positions in two-dimensional space are also illustrated. Pseudocode for all of the novel algorithms depicted herein is available in the supplementary information of this manuscript.