Today, data exploration platforms are widely used to help users locate interesting objects within large volumes of scientific and business data. On those platforms, users try to make sense of the underlying data space by iteratively posing numerous queries over large databases. Hence, data exploration platforms rely on methods for extracting representative data to provide users with a concise and meaningful representation of query results. That is, they extract a few tuples from a query result to provide quick insights into the potentially huge answer space.

In the past few years, the importance of diversification when extracting representative subsets of data has been greatly emphasized. It has been shown that diverse subsets represent the underlying data more effectively by minimizing redundancy and increasing coverage. However, search result diversification adds further cost to an already computationally expensive exploration process.

In this PhD thesis, we focus on the design, implementation, and evaluation of scalable diversification algorithms and schemes for data exploration platforms. In particular, this research stipulates that extracting diverse representative results during data exploration requires addressing several challenges, including: 1) scaling to large volumes of high-dimensional data, 2) supporting a large number of users, and 3) enabling real-time continuous exploration. To address those challenges, we focus on two broad aspects: 1) diversification of high-dimensional large data sets and 2) diversification of multiple user queries.

Existing work conducts diversification in two steps: first compute all relevant query results, and then diversify those results to select a small diverse subset. Moreover, all the dimensional attributes of all the data points in a query result are considered for diversification. Such a generic approach becomes a performance bottleneck in high-dimensional large databases.
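As a rough illustration of the two-step baseline described above, the following Python sketch first materializes every relevant row and then greedily selects a diverse subset. The greedy MaxMin heuristic, the Euclidean distance, and the names `two_step_diversify`, `rows`, `predicate`, and `k` are assumptions made for illustration only; the abstract does not specify which diversity objective or algorithm the existing work uses.

```python
import math


def two_step_diversify(rows, predicate, k):
    """Hypothetical baseline: materialize all results, then diversify them."""
    # Step 1: compute the complete query result before any diversification.
    result = [r for r in rows if predicate(r)]
    if not result or k <= 0:
        return []

    # Step 2: greedy MaxMin selection (an assumed objective), using
    # full-dimensional Euclidean distances over every point in the result.
    selected = [result[0]]
    # min_dist[i] is the distance from result[i] to its nearest selected point.
    min_dist = [math.dist(p, selected[0]) for p in result]
    while len(selected) < min(k, len(result)):
        # Pick the point that is farthest from everything selected so far.
        i = max(range(len(result)), key=min_dist.__getitem__)
        selected.append(result[i])
        min_dist = [min(d, math.dist(p, result[i]))
                    for d, p in zip(min_dist, result)]
    return selected


rows = [(0, 0), (0, 1), (10, 10), (10, 0), (1, 1), (-5, -5)]
subset = two_step_diversify(rows, lambda r: r[0] >= 0, 3)
```

In this sketch both steps touch every relevant tuple, and every distance is computed over all dimensions, which is the cost that grows with cardinality and dimensionality.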
To efficiently compute diverse subsets of query results that exhibit both high dimensionality and high cardinality, we propose the Progressive Diversification Scheme. Our scheme uses partial distance computations to reduce the CPU and I/O cost incurred during query diversification. Moreover, to avoid the overhead of computing all relevant results first, we propose embedding diversification into the query evaluation step by utilizing column-based data storage systems. In addition to the computational cost of diversification methods, we also consider the complexity of the diversity objective function in high-dimensional databases. Often, computing diverse solutions along all the dimensions is not a realistic approach. In fact, users may have pre-specified preferences over some dimensions of the data, while expecting good coverage over the other dimensions. Motivated by that need, we propose a novel scheme that aims to generate representative data that balance the tradeoff between regret minimization and diversity maximization. Our scheme...