Contextualized representations (e.g. ELMo, BERT) have become the default pretrained representations for downstream NLP applications. In some settings, this transition has rendered their static embedding predecessors (e.g. Word2Vec, GloVe) obsolete. As a side effect, we observe that older interpretability methods for static embeddings, while more mature than those available for their dynamic counterparts, are underutilized in studying newer contextualized representations. Consequently, we introduce simple and fully general methods for converting from contextualized representations to static lookup-table embeddings, which we apply to 5 popular pretrained models and 9 sets of pretrained weights. Our analysis of the resulting static embeddings notably reveals that pooling over many contexts significantly improves representational quality under intrinsic evaluation. Complementary to analyzing representational quality, we consider social biases encoded in pretrained representations with respect to gender, race/ethnicity, and religion and find that bias is encoded disparately across pretrained models and internal layers, even for models that share the same training data. Concerningly, we find dramatic inconsistencies between social bias estimators for word embeddings.
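The conversion from contextualized to static embeddings can be illustrated with a minimal sketch. It assumes mean pooling over contexts, which is one plausible choice and not necessarily the aggregation the authors use; the function name `pool_static_embeddings` and the toy two-dimensional vectors are hypothetical stand-ins for real model outputs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Vector = List[float]


def pool_static_embeddings(
    occurrences: Iterable[Tuple[str, Sequence[float]]],
) -> Dict[str, Vector]:
    """Average every contextual vector seen for a word into one static vector.

    Assumption: mean pooling. The abstract only says "pooling over many contexts".
    """
    sums: Dict[str, Vector] = {}
    counts: Dict[str, int] = defaultdict(int)
    for word, vec in occurrences:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[word][i] += x
        counts[word] += 1
    return {w: [s / counts[w] for s in total] for w, total in sums.items()}


# Toy data (made up): each pair is (word, contextual vector from one sentence).
occurrences = [
    ("bank", [1.0, 0.0]),   # e.g. a "river bank" context
    ("bank", [0.0, 1.0]),   # e.g. a "savings bank" context
    ("river", [2.0, 2.0]),
]
table = pool_static_embeddings(occurrences)
```

The resulting `table` is an ordinary lookup table, so analysis tools written for static embeddings can be applied to it directly.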
The recognizer discussed will automatically recognize telephone-quality digits spoken at normal speech rates by a single individual, with an accuracy varying between 97 and 99 percent. After some preliminary analysis of the speech of any individual, the circuit can be adjusted to deliver a similar accuracy on the speech of that individual. The circuit is not, however, in its present configuration, capable of performing equally well on the speech of a series of talkers without recourse to such adjustment. Circuitry involves division of the speech spectrum into two frequency bands, one below and the other above 900 cps. Axis-crossing counts are then individually made of both band energies to determine the frequency of the maximum syllabic rate energy within each band. Simultaneous two-dimensional frequency portrayal is found to possess recognition significance. Standards are then determined, one for each digit of the ten-digit series, and are built into the recognizer as a form of elemental memory. By means of a series of calculations performed automatically on the spoken input digit, a best-match comparison is made with each of the ten standard digit patterns and the digit of best match selected.
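The two steps described here, an axis-crossing frequency estimate per band followed by a best-match comparison against stored standards, can be sketched digitally. This is a simplified illustration and not the original analog circuit: it assumes the two bands have already been separated, uses Euclidean distance as the match criterion, and the function names, sample rate, and "standard" values are all hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def crossing_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Estimate a band's dominant frequency from its axis (zero) crossings.

    A sinusoid crosses the axis twice per cycle, so f is roughly
    crossings / (2 * duration).
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    return crossings / (2.0 * duration)


def best_match(
    point: Tuple[float, float],
    standards: Dict[str, Tuple[float, float]],
) -> str:
    """Return the label whose stored (low-band, high-band) pair is closest."""
    return min(standards, key=lambda label: math.dist(point, standards[label]))


SAMPLE_RATE = 8000.0  # assumed value, for illustration only


def tone(freq: float, n: int = 800) -> list:
    """Synthetic test tone; the phase offset avoids samples landing exactly on zero."""
    return [math.sin(2 * math.pi * freq * k / SAMPLE_RATE + 0.3) for k in range(n)]


# Stand-ins for the band below 900 cps and the band above 900 cps.
low_band = tone(500.0)
high_band = tone(2000.0)
measured = (
    crossing_frequency(low_band, SAMPLE_RATE),
    crossing_frequency(high_band, SAMPLE_RATE),
)

# Made-up standards, not measured digit data.
standards = {"one": (500.0, 2000.0), "two": (300.0, 1200.0)}
digit = best_match(measured, standards)
```

With the synthetic tones above, the estimated pair lands close to the hypothetical "one" standard, which is therefore selected.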
This device provides for rapid analysis of short samples of speech and other sounds. It permits direct viewing of the energy-frequency distribution of the sound at instants of time in a two-dimensional pattern, and also over intervals of time as a three-dimensional pattern. Magnetically recorded on a disk at slow speed, and speeded up 200 times on playback, the sample of sound is analyzed rapidly by a broad band high frequency system. The three-dimensional portrayal (in time, frequency and amplitude) shows the whole sound sample, including an interval of about ½ sec., on one cathode-ray tube, together with a movable indication of the point in time at which the “instantaneous” or two-dimensional frequency-amplitude section, appearing on another tube, is taken. This two-dimensional pattern displays amplitude on a decibel scale and covers about a 40 db range. The complete pattern is made up of 190 amplitude values for the 100 to 4000 cycle frequency range so that the pattern is formed by 190 vertical lines. The effective band width of the resolving filter is 45 cycles. The patterns are scanned at a rate of 2/sec. so that a slow phosphor screen is used for viewing. Successive sections approximately 2½, 5, 10, 20, or 40 milliseconds apart may be established and photographed automatically. Alternatively, for direct viewing, manual selection of any desired point of analysis is provided.
This paper describes a method of identifying phonetic elements in speech. The device continuously correlates the measured frequency spectrum of incoming speech with a set of reference spectra, corresponding to the phonetic elements to be recognized. That spectrum producing the best correlation at any time is indicated as the phonetic element present. In the circuit which has been built, the number of frequency bands used in the spectrum analysis was arbitrarily limited to ten, and the number of phonetic elements identified was also limited to ten. This phonetic element recognizer has been incorporated in an improved form of a digit recognizer; its operation will be shown in a short sound motion picture. Other uses of the phonetic element recognizer would be in a voice typewriter or in the narrow band phonetic element transmission system to be described in a companion paper.
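The matching rule described here, choosing the reference spectrum that correlates best with the measured spectrum, can be sketched as follows. The sketch assumes a Pearson correlation coefficient over ten band energies, which is only one way to realize "correlation" and is not claimed to be what the circuit computes; the function names, element labels, and spectra are hypothetical.

```python
import math
from typing import Dict, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equal-length band-energy spectra."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)) * math.sqrt(
        sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def recognize(spectrum: Sequence[float], references: Dict[str, Sequence[float]]) -> str:
    """Return the label of the reference spectrum with the highest correlation."""
    return max(references, key=lambda name: correlation(spectrum, references[name]))


# Made-up ten-band reference spectra for two hypothetical phonetic elements.
references = {
    "low": [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
    "high": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
}

# A noisy measurement whose energy is concentrated in the lower bands.
measured_spectrum = [8.0, 8.0, 6.0, 6.0, 5.0, 3.0, 3.0, 2.0, 1.0, 1.0]
element = recognize(measured_spectrum, references)
```

Running the comparison continuously on successive spectra would give a stream of element labels, which is the kind of output the voice typewriter and narrow-band transmission uses would consume.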
In a companion paper, one of the authors discusses the automatic recognition of phonetic elements of speech by a device which continuously correlates the measured frequency spectrum of incoming speech with a set of reference spectra. After such recognition, one can transmit the resultant information over lines of very narrow band width for speech synthesis by a reverse process. In a circuit which has been built to study this method of transmission, the number of phonetic elements which is recognized is limited to ten, thus requiring transmission only of information as to ten on-off signals, between sending and receiving terminals, plus a narrow band channel for transmitting pitch information. This information is used to control hiss and buzz energy sources in a synthesizer which is similar to that of the vocoder. A demonstration of the operation of this system will be given using recordings.