Noisy or distorted video/audio training sets pose a constant challenge in automated identification and verification tasks. We propose the method of Mutual Interdependence Analysis (MIA) to extract "mutual features" from a high-dimensional training set. Mutual features represent a class of objects through a unique direction in the span of the inputs that minimizes the scatter of the projected samples of the class. They capture invariant properties of the object class and can therefore be used for classification. The effectiveness of our approach is tested on real data from face and speaker recognition problems. We show that "mutual faces" extracted from the Yale database are illumination invariant, and obtain identification error rates of 2.2% in leave-one-out tests for differently illuminated images. Also, "mutual speaker signatures" for text-independent speaker verification achieve state-of-the-art equal error rates of 6.8% on the NTIMIT database.
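
The idea of a direction in the span of the inputs with minimal projected scatter can be illustrated with a small sketch. It is a simplified illustration, not the authors' MIA algorithm: it assumes more dimensions than samples and linearly independent inputs, in which case a direction with zero scatter (equal projections for every sample) exists. The function name, the target projection value of 1, and the random data are all assumptions made for the example.

```python
import numpy as np

def equal_projection_direction(X):
    """Return w in the span of the columns of X with X.T @ w == 1.

    X has shape (d, n): each column is one training sample of the class.
    Assumes d >= n and linearly independent columns (illustrative only).
    """
    gram = X.T @ X                                   # (n, n) Gram matrix
    coeffs = np.linalg.solve(gram, np.ones(X.shape[1]))
    return X @ coeffs                                # w = X (X^T X)^{-1} 1

# Hypothetical data: 5 samples in a 50-dimensional space.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
w = equal_projection_direction(X)
projections = X.T @ w                                # one value per sample
scatter = projections.var()                          # close to zero
```

Because `w` is a linear combination of the samples and every sample projects to the same value, the scatter of the projections vanishes up to numerical error. Real data with noise would generally call for a regularized variant.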
Consider a sensing system that uses a large number, N, of microphones placed in multiple dimensions to monitor a broadband acoustic field. Using all the microphones at once is impractical because of the amount of data generated. Instead, we choose a subset of D microphones to be active. Specifically, we wish to find the set of D microphones that minimizes the largest interference gain at multiple frequencies while monitoring a target of interest. A direct combinatorial approach, testing all N-choose-D subsets of microphones, is impractical because of the problem size. Instead, we use a convex optimization technique that induces sparsity through an l1-penalty to determine which subset of microphones to use. We test the robustness of our solution through simulated annealing and compare its performance against a classical beamformer that maximizes SNR. Since it is possible to switch from one subset of D microphones to another at every sample, we construct a space-time-frequency sampling scheme that achieves near-optimal performance.
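
To give a sense of why exhaustive search over subsets is impractical, the number of candidate subsets can be counted directly. The array sizes below are hypothetical and chosen only for illustration; the abstract does not state specific values of N and D.

```python
from math import comb

def count_subsets(n_microphones, n_active):
    """Number of distinct ways to choose n_active microphones out of n_microphones."""
    return comb(n_microphones, n_active)

# Hypothetical array sizes (not taken from the paper).
small = count_subsets(5, 2)      # 10 subsets, easy to enumerate
large = count_subsets(100, 10)   # more than 10**13 subsets
```

Even at a rate of one million evaluated subsets per second, the larger case would take months, which motivates replacing enumeration with a sparsity-inducing convex relaxation.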
In this paper we continue our treatment of source separation based on dynamic sparse source signal models. Source signals are modeled in the frequency domain as the product of a Bernoulli selection variable and a deterministic but unknown spectral amplitude variable. The Bernoulli variable is modeled by a first-order Markov process with transition probabilities learned from a training database. We consider a scenario where the mixing parameters are estimated by calibration. We derive the MAP signal estimators and show that the optimization problem reduces to a Belief Propagation Network simulation. We also present preliminary separation performance results using the TIMIT database.
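
The source model described above can be sketched as a generative simulation for a single frequency bin. This only illustrates the model structure, not the MAP estimator or the belief propagation step. The transition probabilities, the amplitude values, and the choice to run the Markov chain across time frames are assumptions for the example; in the paper the probabilities are learned from training data.

```python
import random

def sample_activity(n_frames, p_stay_on, p_stay_off, rng):
    """Sample a binary on/off sequence from a first-order Markov chain."""
    state = rng.random() < 0.5            # arbitrary initial state
    states = []
    for _ in range(n_frames):
        states.append(1 if state else 0)
        stay = p_stay_on if state else p_stay_off
        if rng.random() >= stay:          # leave the current state
            state = not state
    return states

rng = random.Random(0)
n_frames = 20
# Stand-in for the deterministic but unknown spectral amplitudes.
amplitudes = [0.5 + rng.random() for _ in range(n_frames)]
# Hypothetical transition probabilities.
activity = sample_activity(n_frames, p_stay_on=0.9, p_stay_off=0.8, rng=rng)
# Source spectrum in this bin: selection variable times amplitude.
source = [b * a for b, a in zip(activity, amplitudes)]
```

High stay probabilities make the on/off pattern persist over several frames, which is the temporal structure a dynamic sparse model is meant to capture.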