Applications that must detect small visual contrasts require high sensitivity. Event cameras can provide higher dynamic range (DR) and reduce data rate and latency, but most existing event cameras have limited sensitivity. This paper presents results from SDAVIS192, a vision sensor fabricated in a 180-nm TowerJazz CIS process. It outputs temporal contrast dynamic vision sensor (DVS) events and conventional active pixel sensor frames. The SDAVIS192 improves on previous DAVIS sensors with higher sensitivity to temporal contrast. The temporal contrast thresholds can be set as low as 1% for negative changes in logarithmic intensity (OFF events) and 3.5% for positive changes (ON events). This is made possible by an in-pixel preamplification stage. The preamplifier reduces the effective intrascene DR of the sensor (70 dB for OFF and 50 dB for ON), but an automated operating region control allows a DR of at least 110 dB for OFF events. A second contribution of this paper is a characterization methodology for measuring DVS event detection thresholds that incorporates a measure of signal-to-noise ratio (SNR). At an average SNR of 30 dB, the fixed pattern noise of the DVS temporal contrast threshold is measured to be 0.3%-0.8% temporal contrast. Results comparing monochrome and RGBW color filter array DVS events are presented. The higher sensitivity of SDAVIS192 makes this sensor potentially useful for calcium imaging, as shown in a recording from cultured neurons expressing the calcium-sensitive green fluorescent protein GCaMP6f.
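To make the role of the ON and OFF thresholds concrete, the following is a minimal sketch of an idealized single-pixel DVS model, not the SDAVIS192 circuit. It assumes that the pixel memorizes the log intensity at its last event and emits at most one event per sample. The function name `dvs_events`, its parameters, and the sample values are hypothetical. Thresholds are given in natural-log units, where 0.01 corresponds to roughly 1% contrast.

```python
import math


def dvs_events(intensities, on_threshold=0.035, off_threshold=0.01):
    """Idealized DVS pixel: return (sample_index, polarity) events.

    Assumptions (not taken from the paper): thresholds are in
    natural-log intensity units, the reference level resets to the
    current sample after each event, and at most one event is
    emitted per sample. Intensities must be positive.
    """
    events = []
    reference = math.log(intensities[0])  # memorized log intensity
    for index, value in enumerate(intensities[1:], start=1):
        delta = math.log(value) - reference
        if delta >= on_threshold:
            events.append((index, +1))  # ON event: brightness increased
            reference = math.log(value)
        elif delta <= -off_threshold:
            events.append((index, -1))  # OFF event: brightness decreased
            reference = math.log(value)
    return events


if __name__ == "__main__":
    # A 2% drop crosses the 1% OFF threshold, while a 2% rise
    # stays below the 3.5% ON threshold.
    print(dvs_events([100.0, 98.0]))
    print(dvs_events([100.0, 102.0]))
```

With the default values, which mirror the lowest thresholds quoted in the abstract, the model responds to smaller decreases than increases, reflecting the asymmetry between OFF and ON sensitivity.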
This paper reports a study on methods for real-time speaker identification using the output of an event-based silicon cochlea. The methods are evaluated on the amount of computation they require and on their classification performance in a speaker identification task. The study uses the binaural AEREAR2 silicon cochlea, which has 64 frequency channels and 512 output neurons. Auditory features representing fading histograms of inter-spike intervals and channel activity distributions are extracted from the cochlea spikes. These feature vectors are then classified by a linear Support Vector Machine, which is trained on a subset of 40 speakers (20 male, 20 female) from the TIMIT database. Speakers are correctly identified with >90% accuracy during each sentence utterance, with an average latency of 700 ± 200 ms from the start of the sentence.
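The abstract does not specify how the fading histograms are computed. As one possible reading, the sketch below keeps a histogram of inter-spike intervals for a single channel whose bin counts decay exponentially over time, so that recent spikes dominate the feature. The class name `FadingISIHistogram`, the bin edges, and the time constant `tau_s` are all assumptions made for illustration.

```python
import math


class FadingISIHistogram:
    """Exponentially fading histogram of inter-spike intervals (ISIs).

    Hypothetical sketch: bin edges and the decay constant tau_s are
    illustrative choices, not values from the paper.
    """

    def __init__(self, bin_edges_s, tau_s):
        self.bin_edges_s = list(bin_edges_s)
        self.tau_s = tau_s
        self.counts = [0.0] * (len(self.bin_edges_s) - 1)
        self.last_spike_s = None

    def add_spike(self, t_s):
        """Update the histogram with a spike at time t_s (seconds)."""
        if self.last_spike_s is not None:
            isi = t_s - self.last_spike_s
            # Fade existing counts by the time elapsed since the last update.
            decay = math.exp(-isi / self.tau_s)
            self.counts = [count * decay for count in self.counts]
            # Increment the bin that contains this ISI, if any.
            for k in range(len(self.counts)):
                if self.bin_edges_s[k] <= isi < self.bin_edges_s[k + 1]:
                    self.counts[k] += 1.0
                    break
        self.last_spike_s = t_s


if __name__ == "__main__":
    hist = FadingISIHistogram([0.0, 0.001, 0.002, 0.004], tau_s=0.1)
    for spike_time in [0.0, 0.0015, 0.003, 0.006]:
        hist.add_spike(spike_time)
    print(hist.counts)
```

In a full pipeline, the `counts` vectors of all channels could be concatenated into one feature vector and passed to a linear classifier. Updating only on spike arrival keeps the computation event-driven, which matches the real-time motivation of the study.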