The volume of unstructured data being generated today is growing exponentially. Understanding such complex data is essential in applications such as the analysis of social media data, image and video data, sensor data, medical data, and customer reviews. Clustering is a widely accepted approach for organizing and analyzing such documents. An effective and efficient text clustering technique can significantly improve document analysis and grouping with minimal human intervention. The two main factors in text clustering are the text representation model and the clustering algorithm. Unstructured data must first be converted into a structured format before it can be analyzed. Text representation models do this by transforming large volumes of text into vector representations that capture semantic information. In this paper, we investigate text document representation techniques, their limitations, and how text clustering differs from human understanding of text data. We focus on different text representation methods and clustering algorithms to demonstrate how the choice of representation technique can affect clustering results.