Background: Depression is prevalent, chronic, and burdensome. Due to limited screening access, depression often remains undiagnosed. Artificial intelligence (AI) models based on spoken responses to interview questions may offer an effective, efficient alternative to other screening methods.Objective: The primary aim was to use a demographically diverse sample to validate an AI model, previously trained on human-administered interviews, on novel bot-administered interviews, and to check for algorithmic biases related to age, sex, race, and ethnicity.Methods: Using the Aiberry app, adults recruited via social media (N = 393) completed a brief bot-administered interview and a depression self-report form. An AI model was used to predict form scores based on interview responses alone. For all meaningful discrepancies between model inference and form score, clinicians performed a masked review to determine which one they preferred.Results: There was strong concurrent validity between the model predictions and raw self-report scores (r = 0.73, MAE = 3.3). 90% of AI predictions either agreed with self-report or with clinical expert opinion when AI contradicted self-report. There was no differential model performance across age, sex, race, or ethnicity.Limitations: Limitations include access restrictions (English-speaking ability and access to smartphone or computer with broadband internet) and potential self-selection of participants more favorably predisposed toward AI technology.Conclusion: The Aiberry model made accurate predictions of depression severity based on remotely collected spoken responses to a bot-administered interview. This study shows promising results for the use of AI as a mental health screening tool.