We explore unsupervised pre-training for speech recognition by learning representations of raw audio. wav2vec is trained on large amounts of unlabeled audio data and the resulting representations are then used to improve acoustic model training. We pre-train a simple multi-layer convolutional neural network optimized via a noise contrastive binary classification task. Our experiments on WSJ reduce WER of a strong character-based log-mel filterbank baseline by up to 36% when only a few hours of transcribed data are available. Our approach achieves 2.43% WER on the nov92 test set. This outperforms Deep Speech 2, the best reported character-based system in the literature, while using two orders of magnitude less labeled training data.
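The "noise contrastive binary classification task" mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss, so the code assumes the common formulation in which a context vector should score high against the true future sample and low against sampled distractors. The names `contrastive_loss`, `context`, `positive`, and `negatives` are hypothetical, and details such as step-specific projections or weighting of the negative term are omitted.

```python
import math
from typing import List, Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(
    context: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
) -> float:
    """Binary contrastive loss for a single context vector (assumed form).

    The true future sample is treated as class 1 and every distractor
    as class 0; the loss is the summed binary cross-entropy.
    """
    # Reward a high score for the true future sample.
    loss = -log_sigmoid(dot(context, positive))
    # Reward a low score for each distractor.
    for neg in negatives:
        loss -= log_sigmoid(-dot(context, neg))
    return loss


# Hypothetical toy vectors: the aligned sample gives a much smaller loss.
ctx = [1.0, 0.0]
good = contrastive_loss(ctx, [5.0, 0.0], [[-5.0, 0.0]])
bad = contrastive_loss(ctx, [-5.0, 0.0], [[5.0, 0.0]])
print(good < bad)
```

In this toy case the loss is small when the context aligns with the true sample and points away from the distractor, and large when the roles are swapped, which is the behavior a binary contrastive objective is meant to reward.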
Highly ordered silver nanowire arrays have been obtained by pulsed electrodeposition in self-ordered porous alumina templates. Homogeneous filling of all the pores of the alumina template is achieved. The interwire distance is about 110 nm, corresponding to a density of silver nanowires of 61 × 10⁹ in.⁻², and the diameter can be varied between 30 and 70 nm. The silver wires are monocrystalline with some twin lamella defects and grow perpendicular to the ⟨110⟩ direction. The previously encountered difficulty in obtaining 100% filling of the alumina pores is discussed in the framework of electrostatic instabilities, taking into account the different potential contributions during electrodeposition. To obtain homogeneously filled pore membranes, a highly conductive metal-containing electrolyte, a homogeneous aluminum oxide barrier layer, and pulsed electrodeposition are prerequisites.
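The quoted wire density can be checked from the interwire distance with a short calculation. The sketch below assumes the pores form an ideal hexagonal lattice, which the abstract does not state explicitly; the function name `hexagonal_density_per_sq_inch` is hypothetical. In a hexagonal lattice with pitch d, each wire occupies an area of (√3/2)·d².

```python
import math

NM_PER_INCH = 2.54e7  # 1 in = 2.54 cm = 2.54e7 nm


def hexagonal_density_per_sq_inch(pitch_nm: float) -> float:
    """Wires per square inch for an ideal hexagonal lattice (assumed geometry)."""
    # Area occupied by one lattice point in a hexagonal array of pitch d.
    cell_area_nm2 = (math.sqrt(3) / 2) * pitch_nm ** 2
    # Number of such cells that fit into one square inch.
    return NM_PER_INCH ** 2 / cell_area_nm2


density = hexagonal_density_per_sq_inch(110.0)
print(f"{density:.2e} wires per square inch")
```

Under this assumption a 110 nm pitch gives roughly 6.2 × 10¹⁰ wires per square inch, consistent with the reported value of about 61 × 10⁹ per square inch.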
Estimating the pose of multiple animals is a challenging computer vision problem: frequent interactions cause occlusions and complicate the association of detected keypoints with the correct individuals, and highly similar-looking animals interact more closely than in typical multi-human scenarios. To take up this challenge, we build on DeepLabCut, an open-source pose estimation toolbox, and provide high-performance animal assembly and tracking, features required for multi-animal scenarios. Furthermore, we integrate the ability to predict an animal's identity to assist tracking (in case of occlusions). We illustrate the power of this framework with four datasets varying in complexity, which we release to serve as a benchmark for future algorithm development.