This study investigates the efficacy of six major Generative AI (GenAI) text detectors when confronted with machine-generated content modified to evade detection (n = 805). We compare these detectors to assess their reliability in identifying AI-generated text in educational settings, where they are increasingly used to address academic integrity concerns. Results show significant reductions in detector accuracy (17.4%) when faced with simple techniques to manipulate the AI generated content. The varying performances of GenAI tools and detectors indicate they cannot currently be recommended for determining academic integrity violations due to accuracy limitations and the potential for false accusation which undermines inclusive and fair assessment practices. However, these tools may support learning and academic integrity when used non-punitively. This study aims to guide educators and institutions in the critical implementation of AI text detectors in higher education, highlighting the importance of exploring alternatives to maintain inclusivity in the face of emerging technologies.