Dimensionality reduction methods, also known as projections, are often used to explore multidimensional data in machine learning, data science, and information visualization. However, several such methods, including the well-known t-distributed stochastic neighbor embedding and its variants, are computationally expensive for large datasets, suffer from stability problems, and cannot directly handle out-of-sample data. We propose a learning approach to construct any such projection. We train a deep neural network on a set of samples drawn from a given data universe and their corresponding two-dimensional projections, computed with any user-chosen technique. Next, we use the network to infer projections of any dataset from the same universe. Our approach generates projections with characteristics similar to those of the learned ones, is two to four orders of magnitude faster than existing projection methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique. We demonstrate our proposal on several real-world high-dimensional datasets from machine learning.
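The training scheme described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' setup: a fixed linear map (`reference_projection`) stands in for the user-chosen projection technique, the data universe is uniform random 3D points, and the network is a single-hidden-layer regressor written in plain Python rather than a deep network.

```python
import math
import random


def reference_projection(x):
    # Hypothetical stand-in for the user-chosen technique: a fixed 3D -> 2D linear map.
    return [x[0] + 0.5 * x[2], x[1] - 0.5 * x[2]]


def init_network(n_in, n_hidden, n_out, rng):
    # Small random weights, zero biases.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    return w1, b1, w2, b2


def forward(net, x):
    # tanh hidden layer followed by a linear output layer.
    w1, b1, w2, b2 = net
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)]
    return h, y


def mean_squared_error(net, xs, ys):
    total = 0.0
    for x, t in zip(xs, ys):
        _, y = forward(net, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(xs)


def train(net, xs, ys, epochs=100, lr=0.05):
    # Plain stochastic gradient descent on the squared error, updating weights in place.
    w1, b1, w2, b2 = net
    for _ in range(epochs):
        for x, t in zip(xs, ys):
            h, y = forward(net, x)
            dy = [2.0 * (yi - ti) for yi, ti in zip(y, t)]
            # Back-propagate to the hidden layer before the output weights change.
            dh = [
                sum(dy[k] * w2[k][j] for k in range(len(dy))) * (1.0 - h[j] ** 2)
                for j in range(len(h))
            ]
            for k in range(len(dy)):
                for j in range(len(h)):
                    w2[k][j] -= lr * dy[k] * h[j]
                b2[k] -= lr * dy[k]
            for j in range(len(h)):
                for i in range(len(x)):
                    w1[j][i] -= lr * dh[j] * x[i]
                b1[j] -= lr * dh[j]
    return net


rng = random.Random(0)
# Training samples and their projections, computed once with the reference technique.
train_x = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
train_y = [reference_projection(x) for x in train_x]
# Out-of-sample points from the same universe, never seen during training.
test_x = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
test_y = [reference_projection(x) for x in test_x]

net = init_network(3, 8, 2, rng)
error_before = mean_squared_error(net, test_x, test_y)
train(net, train_x, train_y)
error_after = mean_squared_error(net, test_x, test_y)
print(f"out-of-sample error before: {error_before:.4f}, after: {error_after:.4f}")
```

Once trained, projecting a new point costs only a single forward pass, and no call to the reference technique is needed.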
The Southern Photometric Local Universe Survey (S-PLUS) is imaging ∼9300 deg$^2$ of the celestial sphere in 12 optical bands using a dedicated 0.8 m robotic telescope, the T80-South, at the Cerro Tololo Inter-American Observatory, Chile. The telescope is equipped with a 9.2k × 9.2k e2v detector with 10 $\mu\mathrm{m}$ pixels, resulting in a field of view of 2 deg$^2$ with a plate scale of 0.55 arcsec pixel$^{-1}$. The survey consists of four main subfields, which include two non-contiguous fields at high Galactic latitudes (|b| > 30°, 8000 deg$^2$) and two areas of the Galactic Disc and Bulge (for an additional 1300 deg$^2$). S-PLUS uses the Javalambre 12-band magnitude system, which includes the 5 ugriz broad-band filters and 7 narrow-band filters centred on prominent stellar spectral features: the Balmer jump/[O II], Ca H + K, Hδ, G band, Mg b triplet, Hα, and the Ca triplet. S-PLUS delivers accurate photometric redshifts (δz/(1 + z) = 0.02 or better) for galaxies with r < 19.7 AB mag and z < 0.4, thus producing a 3D map of the local Universe over a volume of more than $1\, (\mathrm{Gpc}/h)^3$. The final S-PLUS catalogue will also enable the study of star formation and stellar populations in and around the Milky Way and nearby galaxies, as well as searches for quasars, variable sources, and low-metallicity stars. In this paper we introduce the main characteristics of the survey, illustrated with science verification data highlighting the unique capabilities of S-PLUS. We also present the first public data release of ∼336 deg$^2$ of the Stripe 82 area, in 12 bands, to a limiting magnitude of r = 21, available at datalab.noao.edu/splus.
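As a quick consistency check on the quoted instrument figures, the arithmetic can be sketched as follows. The sketch assumes that "9.2k" means roughly 9200 pixels per side (the exact pixel count is not given above); the function names are illustrative.

```python
def field_of_view_deg2(pixels_per_side, plate_scale_arcsec):
    # Side length in degrees (3600 arcsec per degree), then area of the square field.
    side_deg = pixels_per_side * plate_scale_arcsec / 3600.0
    return side_deg ** 2


def redshift_uncertainty(z, normalized_error=0.02):
    # Converts the quoted precision dz / (1 + z) into an absolute uncertainty dz.
    return normalized_error * (1.0 + z)


# Assumed: 9200 pixels per side; 0.55 arcsec per pixel is the quoted plate scale.
fov = field_of_view_deg2(9200, 0.55)
print(f"field of view: {fov:.2f} deg^2")  # prints "field of view: 1.98 deg^2"

# At the upper end of the quoted redshift range, z = 0.4.
dz = redshift_uncertainty(0.4)
print(f"dz at z = 0.4: {dz:.3f}")  # prints "dz at z = 0.4: 0.028"
```

Under that assumption the detector covers about 1.98 deg$^2$, consistent with the quoted field of view of 2 deg$^2$.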
The design of binary morphological operators that are translation-invariant and locally defined by a finite neighborhood window corresponds to the problem of designing Boolean functions. As in any supervised classification problem, morphological operators designed from training samples also suffer from overfitting. A large neighborhood tends to degrade the performance of the designed operator. This work proposes a multi-level design approach to deal with the issue of designing operators based on large neighborhoods. The main idea is inspired by stacked generalization (a multi-level classifier design approach) and consists of combining, at each training level, the outcomes of the previous-level operators. The final operator is a multi-level operator that ultimately depends on a larger neighborhood than those of the individual operators that have been combined. Experimental results show that two-level operators obtained by combining operators designed on subwindows of a large window consistently outperform the single-level operators designed on the full window. They also show that iterating two-level operators is an effective multi-level approach to obtain better results.
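The two-level idea can be sketched on a toy problem. The sketch below makes several assumptions that are not taken from the work itself: it uses 1D binary signals instead of images, estimates each Boolean function by majority vote over the observed window patterns, wraps around at the signal borders, and outputs 0 for patterns never seen in training. All names are illustrative.

```python
import random
from collections import defaultdict


def learn_boolean_function(patterns, labels):
    # Majority vote per observed pattern; ties resolve to 0.
    votes = defaultdict(lambda: [0, 0])
    for pattern, label in zip(patterns, labels):
        votes[pattern][label] += 1
    return {pattern: int(count[1] > count[0]) for pattern, count in votes.items()}


def window_patterns(signal, offsets):
    # Values seen through the window at every position (wrap-around at the borders).
    n = len(signal)
    return [tuple(signal[(i + d) % n] for d in offsets) for i in range(n)]


def apply_operator(table, signal, offsets, default=0):
    # Unseen patterns fall back to the default value.
    return [table.get(p, default) for p in window_patterns(signal, offsets)]


def train_two_level(observed, ideal, subwindows):
    # Level 1: one operator per subwindow, each trained against the ideal signal.
    level1 = [
        learn_boolean_function(window_patterns(observed, w), ideal) for w in subwindows
    ]
    # Level 2: a Boolean function over the vector of level-1 outputs at each position.
    outputs = [apply_operator(t, observed, w) for t, w in zip(level1, subwindows)]
    level2 = learn_boolean_function(list(zip(*outputs)), ideal)
    return level1, level2


def apply_two_level(level1, level2, subwindows, signal, default=0):
    outputs = [apply_operator(t, signal, w) for t, w in zip(level1, subwindows)]
    return [level2.get(p, default) for p in zip(*outputs)]


# Three 3-point subwindows that together span a 5-point window.
SUBWINDOWS = [(-2, -1, 0), (-1, 0, 1), (0, 1, 2)]

rng = random.Random(1)
# Toy ideal signal made of runs of equal values; the observed copy has random bit flips.
ideal = []
while len(ideal) < 400:
    ideal.extend([rng.randint(0, 1)] * rng.randint(3, 8))
observed = [bit ^ int(rng.random() < 0.05) for bit in ideal]

level1, level2 = train_two_level(observed, ideal, SUBWINDOWS)
restored = apply_two_level(level1, level2, SUBWINDOWS, observed)
```

Each level-1 table has at most 2^3 entries and the level-2 table also has at most 2^3, whereas a single operator on the full 5-point window would need up to 2^5. This smaller number of patterns to estimate is the motivation for combining subwindow operators when training data is limited.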