There is a current debate about the extent to which ChatGPT, a natural language AI chatbot, can disrupt processes in higher education settings. The chatbot can not only answer queries in a human-like way within seconds but also produce long passages of text in the form of essays, emails, and code. In this study, set in the context of higher education and adopting an experimental design approach, we applied ChatGPT-3 to a traditional form of assessment to determine its capabilities and limitations. Specifically, we tested its ability to produce an essay on a topic of our choice, create a rubric, and assess the produced essay in accordance with that rubric. We then evaluated the chatbot’s work by assessing ChatGPT’s application of its rubric against a modified version of Paul’s (2005) Intellectual Standards rubric. Drawing on Christensen et al.’s (2015) framework on disruptive innovations, we found that ChatGPT was capable of completing the set tasks competently, quickly, and easily, like a “magic wand”. However, our findings also challenge the extent to which all of ChatGPT’s demonstrated capabilities can disrupt this traditional form of assessment, given that there are aspects of the assessment’s construction and evaluation that the technology is not yet able to replicate as a human expert would. These limitations of the chatbot provide an opportunity to address vulnerabilities in traditional forms of assessment in higher education that are exposed to the academic integrity issues posed by this form of AI. We conclude the article with implications for teachers and higher education institutions, urging them to reconsider and revisit their assessment practices.