Peer reviewed. eScholarship.org: Powered by the California Digital Library, University of California.

largest absolute correlations with the label. However, he or she verifies the correlations (with the label) on the holdout set and uses only those variables whose correlation agrees in sign with the correlation on the training set and for which both correlations are larger than some threshold in absolute value. The analyst then creates a simple linear threshold classifier on the selected variables using only the signs of the correlations of the selected variables. A final test evaluates the classification accuracy of the classifier on the holdout set. Full details of the analyst's algorithm can be found in section 3 of (17).

In our first experiment, each attribute is drawn independently from the normal distribution N(0,1), and we choose the class label y ∈ {−1, 1} uniformly at random so that there is no correlation between the data point and its label. We chose n = 10,000 and d = 10,000 and varied the number of selected variables k. In this scenario no classifier can achieve true accuracy better than 50%. Nevertheless, reusing a standard holdout results in reported accuracy of >63 ± 0.4% for k = 500 on both the training set and the holdout set. The average and standard deviation of results obtained from 100 independent executions of the experiment are plotted in Fig. 1A, which also includes the accuracy of the classifier on another fresh data set of size n drawn from the same distribution.

We then executed the same algorithm with our reusable holdout. The algorithm Thresholdout was invoked with T = 0.04 and t = 0.01, which explains why the accuracy of the classifier reported by Thresholdout is off by up to 0.04 whenever the accuracy on the holdout set is within 0.04 of the accuracy on the training set. Thresholdout prevents the algorithm from overfitting to the holdout set and gives a valid estimate of classifier accuracy. In Fig. 1B, we plot the accuracy of the classifier as reported by Thresholdout. In addition, in fig. S2 we include a plot of the actual accuracy of the produced classifier on the holdout set.

In our second experiment, the class labels are correlated with some of the variables. As before, the label is randomly chosen from {−1, 1} and each of the attributes is drawn from N(0,1), aside from 20 attributes drawn from N(y·0.06,1), where y is the class label. We execute the same algorithm on this data with both the standard holdout and Thresholdout and plot the results in Fig. 2. Our experiment shows that when using the reusable holdout, the algorithm still finds a good classifier while preventing overfitting.

Overfitting to the standard holdout set arises in our experiment because the analyst reuses the holdout after using it to measure the correlation of single attributes. We first note that neither cross-validation nor bootstrap resolves this issue. If we used either of these methods to validate the correlations, overfitting would still arise as a result of using the same data for training and validation (...
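The standard-holdout part of the experiments described above can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions that do not appear in the text: the function names (`make_data`, `select_variables`, `accuracy`, `run_experiment`), the use of the empirical mean of x_i·y as the correlation, the selection threshold of 1/√n, the choice to filter by sign and threshold first and then keep the k variables with the largest training correlation, and the reduced problem size used to keep the run short. Thresholdout itself is not implemented here.

```python
import numpy as np


def make_data(n, d, rng, n_signal=0, bias=0.0):
    """Labels uniform on {-1, +1}; attributes drawn from N(0, 1), except the
    first n_signal attributes, which are drawn from N(y * bias, 1)."""
    y = rng.choice([-1.0, 1.0], size=n)
    X = rng.standard_normal((n, d))
    X[:, :n_signal] += bias * y[:, None]
    return X, y


def select_variables(X_train, y_train, X_hold, y_hold, k, threshold):
    """Pick up to k variables whose training and holdout correlations agree
    in sign and both exceed the threshold in absolute value."""
    # Assumption: correlation is estimated by the empirical mean of x_i * y.
    corr_train = X_train.T @ y_train / len(y_train)
    corr_hold = X_hold.T @ y_hold / len(y_hold)
    agree = (
        (np.sign(corr_train) == np.sign(corr_hold))
        & (np.abs(corr_train) > threshold)
        & (np.abs(corr_hold) > threshold)
    )
    candidates = np.flatnonzero(agree)
    # Assumption: among the survivors, keep the k largest training correlations.
    order = np.argsort(-np.abs(corr_train[candidates]))[:k]
    idx = candidates[order]
    return idx, np.sign(corr_train[idx])


def accuracy(X, y, idx, signs):
    """Accuracy of the linear threshold classifier sign(sum_i s_i * x_i)."""
    pred = np.sign(X[:, idx] @ signs)
    return float(np.mean(pred == y))


def run_experiment(n, d, k, seed, n_signal=0, bias=0.0):
    """Run one trial with a naively reused holdout and return the accuracy on
    the training set, the holdout set, and a fresh set of the same size."""
    rng = np.random.default_rng(seed)
    X_tr, y_tr = make_data(n, d, rng, n_signal, bias)
    X_ho, y_ho = make_data(n, d, rng, n_signal, bias)
    X_fr, y_fr = make_data(n, d, rng, n_signal, bias)
    # Assumption: threshold of one standard error of the correlation estimate.
    idx, signs = select_variables(X_tr, y_tr, X_ho, y_ho, k, 1.0 / np.sqrt(n))
    return {
        "train": accuracy(X_tr, y_tr, idx, signs),
        "holdout": accuracy(X_ho, y_ho, idx, signs),
        "fresh": accuracy(X_fr, y_fr, idx, signs),
    }


if __name__ == "__main__":
    # Much smaller than the n = d = 10,000 used in the text, to keep it quick.
    print("no signal:  ", run_experiment(n=2000, d=2000, k=50, seed=0))
    # Second experiment: 20 attributes drawn from N(0.06 * y, 1).
    print("with signal:", run_experiment(n=2000, d=2000, k=50, seed=0,
                                         n_signal=20, bias=0.06))
```

In the no-signal setting, the holdout accuracy in this sketch should sit visibly above 50% while the accuracy on the fresh set stays close to 50%. The exact figures depend on the seed and on the reduced sizes, so they will not match the numbers reported in the text.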
For uniform frequency stepped pulse trains, there can be undesirable peaks of the autocorrelation function, known as "grating lobes". In this paper we address this issue, using an approach which allows us to suppress grating lobes below a desired threshold level in the case of appropriately chosen stepped frequency waveforms, i.e., sequences of narrowband pulses that span the desired bandwidth. We discuss in detail how to choose relevant parameters in order to produce such waveforms with small grating lobes, and give examples of waveforms with small overlap ratio. We also discuss the issue of high sidelobes in the vicinity of the main lobe, which are inevitable in a train of LFM waveforms, and show that it is possible to suppress these, as well as the grating lobes, by means of phase modulation.