In the era of large AI models, the intricate architectures and vast parameter sets of models such as large language models (LLMs) pose significant challenges for effective AI quality management (AIQM). This paper investigates the quality assurance of a specific LLM-based AI product: ChatGPT-based sentiment analysis. The study focuses on stability issues, examining both the operational behavior and the robustness of ChatGPT's underlying large-scale AI model. Experiments on benchmark sentiment-analysis datasets show that ChatGPT-based sentiment analysis is susceptible to uncertainty arising from various operational factors. Furthermore, the study reveals that the ChatGPT-based model faces stability challenges, particularly when confronted with conventional small-text adversarial attacks that target robustness.
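The kind of product under study can be sketched as follows. This is a minimal illustration of an LLM-based sentiment classifier, not the paper's actual experimental setup: the prompt wording, the `call_model` stub, and all function names are assumptions introduced here for clarity.

```python
# Hypothetical sketch (not from the paper) of a ChatGPT-style sentiment
# classifier: the input text is wrapped in a classification prompt, sent to
# a chat-completion endpoint, and the free-form reply is normalized to a
# discrete label. The `call_model` stub stands in for a real API call so
# the sketch runs offline.

def build_prompt(text: str) -> list:
    """Build a chat-style prompt asking the model for a sentiment label."""
    return [
        {"role": "system",
         "content": "Classify the sentiment of the user's text as "
                    "positive, negative, or neutral. Reply with one word."},
        {"role": "user", "content": text},
    ]

def parse_label(reply: str) -> str:
    """Normalize a free-form model reply to one of three labels."""
    reply = reply.strip().lower()
    for label in ("positive", "negative", "neutral"):
        if label in reply:
            return label
    return "neutral"  # fall back when the reply is unparseable

def classify(text: str, call_model) -> str:
    """Run one classification; `call_model` would wrap a chat API call."""
    return parse_label(call_model(build_prompt(text)))

# Stand-in for a real model endpoint, so the example is self-contained.
fake_model = lambda messages: "Positive."
print(classify("What a wonderful movie!", fake_model))  # -> positive
```

Because the model returns free-form text, the label-parsing step is itself one of the operational factors that can introduce the kind of output instability the study examines.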