Early warning of long-term hospitalization in schizophrenia (SCZ) patients at the time of admission is crucial for effective resource allocation and individualized treatment planning. In this study, we developed a deep learning model that integrates demographic, behavioral, and blood test data collected at admission to predict extended hospital stays in a retrospective cohort. Using language models (LMs), our algorithm efficiently extracts 95% of the unstructured electronic health record data needed for this work while preserving data privacy and maintaining a low error rate. This paradigm also showed significant advantages in reducing potential discrimination and erroneous dependencies. With multimodal features, the model achieved a classification accuracy of 0.81 and an AUC of 0.9. Key risk factors included advanced age, longer disease duration, and blood markers such as an elevated neutrophil-to-lymphocyte ratio, lower lymphocyte percentage, and reduced albumin levels; these were validated through comprehensive interpretability analyses and ablation studies. Multimodal data significantly improved prediction performance: demographic variables alone yielded an accuracy of 0.73, which rose to 0.81 with the addition of behavioral and blood test data. Our approach outperformed traditional machine learning methods, which were less effective at predicting long-term stays. This study demonstrates the potential of integrating diverse data types to improve predictive accuracy in mental health care, providing a robust framework for early intervention and personalized treatment in schizophrenia management.