Background and Objective: Depression is a substantial public health issue, with global ramifications. While initial literature reviews explored the intersection between artificial intelligence (AI) and mental health, they have not yet critically assessed the specific contributions of Large Language Models (LLMs) in this domain. The objective of this systematic review was to examine the usefulness of LLMs in diagnosing and managing depression, as well as to investigate their incorporation into clinical practice. Methods: This review was based on a thorough search of the PubMed, Embase, Web of Science, and Scopus databases for the period January 2018 through March 2024. The search used PROSPERO and adhered to PRISMA guidelines. Original research articles, preprints, and conference papers were included, while non-English and non-research publications were excluded. Data extraction was standardized, and the risk of bias was evaluated using the ROBINS-I, QUADAS-2, and PROBAST tools. Results: Our review included 34 studies that focused on the application of LLMs in detecting and classifying depression through clinical data and social media texts. LLMs such as RoBERTa and BERT demonstrated high effectiveness, particularly in early detection and symptom classification. Nevertheless, the integration of LLMs into clinical practice is in its nascent stage, with ongoing concerns about data privacy and ethical implications. Conclusion: LLMs exhibit significant potential for transforming strategies for diagnosing and treating depression. Nonetheless, full integration of LLMs into clinical practice requires rigorous testing, ethical considerations, and enhanced privacy measures to ensure their safe and effective use.