Recent face recognition experiments on a major benchmark (LFW [14]) show stunning performance: a number of algorithms achieve near-perfect scores, surpassing human recognition rates. In this paper, we advocate evaluation at the million scale (LFW includes only 13K photos of 5K people). To this end, we have assembled the MegaFace dataset and created the first MegaFace challenge. Our dataset includes one million photos capturing more than 690K distinct individuals. The challenge evaluates the performance of algorithms with an increasing number of "distractors" (from 10 to 1M) in the gallery set. We present both identification and verification performance, evaluate performance with respect to pose and a person's age, and compare performance as a function of training-data size (number of photos and number of people). We report results for state-of-the-art and baseline algorithms. The MegaFace dataset, baseline code, and evaluation scripts are all publicly released for further experimentation.
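The identification protocol described above (a probe photo matched against a gallery padded with growing numbers of distractors) can be sketched as follows. This is a minimal illustration, not the challenge's released evaluation code: the function name, the 128-dimensional toy embeddings, and the use of cosine similarity as the matcher are all assumptions for the example.

```python
import numpy as np

def rank1_identification_rate(probe_feats, gallery_feats, probe_ids, gallery_ids):
    """Fraction of probes whose nearest gallery neighbor (cosine similarity)
    carries the correct identity, i.e. rank-1 identification accuracy."""
    p = probe_feats / np.linalg.norm(probe_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    nearest = (p @ g.T).argmax(axis=1)  # best gallery match per probe
    return float((np.asarray(gallery_ids)[nearest] == np.asarray(probe_ids)).mean())

# Toy setup: embeddings for 3 known identities, plus random "distractor"
# embeddings mixed into the gallery, mirroring the challenge design.
rng = np.random.default_rng(0)
known = rng.normal(size=(3, 128))
probes = known + 0.01 * rng.normal(size=known.shape)  # noisy second photo of each person
distractors = rng.normal(size=(1000, 128))
gallery = np.vstack([known, distractors])
gallery_ids = np.concatenate([np.arange(3), -np.ones(1000, dtype=int)])
rate = rank1_identification_rate(probes, gallery, np.arange(3), gallery_ids)
```

Sweeping the distractor count (10, 100, ..., 1M) and re-running this measurement traces out the accuracy-versus-gallery-size curves the challenge reports.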
Text classification (TC) is the task of automatically assigning documents to a fixed number of categories. TC is an important component of many text-processing applications, many of which begin with preprocessing. Common text preprocessing methods include conversion of uppercase letters into lowercase, HTML tag removal, stopword removal, punctuation-mark removal, lemmatization, correction of commonly misspelled words, and reduction of replicated characters. We hypothesize that applying different combinations of preprocessing methods can improve TC results. Therefore, as our main research contribution, we performed an extensive and systematic set of TC experiments exploring the impact of all possible combinations of five or six basic preprocessing methods on four complete benchmark text corpora (not samples of them), using three machine learning methods with separate training and test sets. The general conclusion (at least for the datasets examined) is that it is always advisable to explore an extensive and systematic variety of preprocessing-method combinations in TC experiments, because doing so improves TC accuracy. For every tested dataset there was at least one combination of basic preprocessing methods that significantly improved TC using a bag-of-words (BOW) representation. For three datasets, stopword removal was the only single preprocessing method that yielded a significant improvement over the baseline of a bag of 1,000-word unigrams. For some datasets, removing HTML tags, correcting spelling, removing punctuation marks, or reducing replicated characters produced only minimal improvement. For the fourth dataset, however, stopword removal was not beneficial; instead, conversion of uppercase letters into lowercase was the only single preprocessing method that yielded a significant improvement over the baseline.
The best result for this dataset was obtained by combining spelling correction with conversion into lowercase letters.
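The exhaustive search over preprocessing combinations described above can be sketched as follows. This is an illustrative assumption of how such a sweep might be organized, not the authors' code: the step implementations are deliberately minimal toy versions, and the stopword list is a placeholder.

```python
import itertools
import re
import string

# Toy versions of five basic preprocessing steps (names and behavior are
# illustrative assumptions, not the paper's exact implementations).
def lowercase(t): return t.lower()
def strip_html(t): return re.sub(r"<[^>]+>", " ", t)
def remove_punct(t): return t.translate(str.maketrans("", "", string.punctuation))
STOPWORDS = {"the", "a", "an", "is", "to", "of"}  # placeholder list
def remove_stopwords(t): return " ".join(w for w in t.split() if w.lower() not in STOPWORDS)
def reduce_repeats(t): return re.sub(r"(.)\1{2,}", r"\1\1", t)  # "soooo" -> "soo"

STEPS = [lowercase, strip_html, remove_stopwords, remove_punct, reduce_repeats]

def all_pipelines(steps):
    """Yield every subset of preprocessing steps (applied in a fixed order),
    including the empty pipeline as the no-preprocessing baseline."""
    for r in range(len(steps) + 1):
        yield from itertools.combinations(steps, r)

def apply_pipeline(text, combo):
    for step in combo:
        text = step(text)
    return text

cleaned = apply_pipeline("<b>The</b> movie was soooo GOOD!!!", tuple(STEPS))
```

With five steps this enumerates 2^5 = 32 pipelines; each cleaned corpus would then be vectorized as a bag of words and scored with the chosen classifier on the held-out test set.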
Both human activity and climate change can influence erosion rates and initiate rapid landscape change. Understanding the relative impact of these factors is critical to managing the risks of extreme erosion related to flooding and landslide occurrence. Here we present a 2,100-year record of sediment mass accumulation and inferred erosion based on lacustrine sediment cores from Amherst Lake, Vermont, USA. Using deposition from Tropical Storm Irene (August 2011) as a modern analogue, we identified distinct event deposits indicative of destructive erosion events. These deposits record a prolonged (multidecadal) interval of enhanced erosion following the initial storm-induced landscape disturbance. The direct impact of human land-cover alteration is minimal in comparison to the more recent twentieth-century increase in the occurrence of catastrophic erosion, which is linked to overall wetter conditions that favor high erosion rates and more readily trigger landslides during periods of extreme precipitation.
Paleotemperature reconstructions are essential for distinguishing anthropogenic climate change from natural variability. An emerging method in paleolimnology is the use of branched glycerol dialkyl glycerol tetraethers (brGDGTs) in sediments to reconstruct temperature, but their application is hindered by a limited understanding of their sources, seasonal production, and transport. Here, we report seasonally resolved measurements of brGDGT production in the water column, in catchment soils, and in a sediment core from Basin Pond, a small, deep inland lake in Maine, USA. We find similar brGDGT distributions in water column and lake sediment samples, whereas the catchment soils have distinct brGDGT distributions, suggesting that (1) brGDGTs are produced within the lake and (2) this in situ production dominates the down-core sedimentary signal. Seasonally, depth-resolved measurements indicate that most brGDGT production occurs in late fall and at intermediate depths (18-30 m) in the water column. We utilize these observations to help interpret a Basin Pond brGDGT-based temperature reconstruction spanning the past 900 years. This record exhibits trends similar to a pollen record from the same site and to regional and global syntheses of terrestrial temperatures over the last millennium. However, the Basin Pond temperature record shows higher-frequency variability than has previously been captured by such an archive in the northeastern United States, potentially attributable to the North Atlantic Oscillation and to volcanic or solar activity. This first brGDGT-based multi-centennial paleoreconstruction from this region contributes to our understanding of the production and fate of brGDGTs in lacustrine systems.