The increasing use of biometrics (e.g., fingerprints, faces, or voices) to unlock access to and interact with online services raises concerns about the trade-offs between convenience, privacy, and security. Service providers must authenticate their users, while individuals may wish to maintain privacy and limit the disclosure of sensitive attributes beyond the authentication step, e.g., when interacting with Voice User Interfaces (VUIs). Preserving privacy while performing authentication is challenging, particularly where adversaries can use biometric data to train transformation tools (e.g., to produce 'deepfaked' speech) and use the faked output to defeat existing authentication systems. In this paper, we take a step towards understanding the security and privacy requirements needed to establish the threat and defense boundaries. We introduce a secure, flexible, privacy-preserving system that captures and stores an on-device fingerprint of the user's raw signal (i.e., voice) for authentication, instead of sending or sharing the raw biometric signal. We then analyze this fingerprint using different predictors, each evaluating its legitimacy from a different perspective (e.g., target identity claim, spoofing attempt, and liveness). We fuse the predictors' decisions to make a final decision on whether the user input is legitimate. Under cross-validation, our verification technique validates legitimate users with an accuracy of 98.68%. The pipeline runs in tens of milliseconds when tested on a CPU and a single-core ARM processor, without specialized hardware.
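
The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the AND rule, the 0.5 threshold, and the names `PredictorScores` and `fuse` below are assumptions made purely for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictorScores:
    """Hypothetical per-predictor confidences in [0, 1] for one fingerprint."""
    identity: float   # confidence that the fingerprint matches the claimed identity
    bona_fide: float  # confidence that the input is not a spoofing attempt
    liveness: float   # confidence that the input comes from a live speaker


def fuse(scores: PredictorScores, threshold: float = 0.5) -> bool:
    """Assumed AND-rule fusion: accept only if every predictor reaches the threshold."""
    return all(
        s >= threshold
        for s in (scores.identity, scores.bona_fide, scores.liveness)
    )


if __name__ == "__main__":
    # A matching identity is still rejected when the spoofing predictor objects.
    print(fuse(PredictorScores(identity=0.9, bona_fide=0.8, liveness=0.7)))
    print(fuse(PredictorScores(identity=0.9, bona_fide=0.2, liveness=0.7)))
```

Under this assumed rule, a single low score from any perspective is enough to reject the input. A weighted or learned fusion would be an equally plausible reading of the text.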