With online social networks and mobile devices flourishing, social media is becoming ubiquitous in our daily lives. However, poor data quality and the diversity of multimedia content are major obstacles for users trying to find the information they want. Furthermore, efficient retrieval becomes a non-trivial task because of the massive and growing amount of media data. Social media systems therefore require data cleaning and understanding techniques for intelligent management, as well as efficient and accurate retrieval frameworks to handle the large amount of data. In this thesis, we study different content management and retrieval methodologies to address the above problems.
The first part of this thesis aims to improve the data quality of social media by automatically completing missing information. Specifically, we present a spatial-aware multimodal location estimation framework to predict the location of media items whose location is unknown. Our method consists of multiple models that utilize various information sources, including the textual and visual content of the social media.
Understanding the visual content plays a crucial role in multimedia retrieval. With the help of deep learning techniques, machines are now able to recognize certain objects in images and videos. In the second part of this thesis, we design a new neural network architecture that automatically learns discriminative visual features from the visual content of a large-scale annotated dataset. The network is able to capture the fine-grained differences among images.
To tackle the large data volume and high feature dimensionality, in the third part of this thesis, we propose an efficient retrieval framework that combines the bag-of-visual-words model with deep learning features. The burstiness issue and the quantization error are revisited and addressed in our scenario. Efficient storage and fast retrieval are achieved by incorporating an inverted index into our framework.
Sometimes, the initial retrieval results might not be satisfactory because of noisy and irrelevant data. Therefore, in the last part of this thesis, we introduce a query-adaptive reranking method to further refine the search results. The key idea is to suppress the negative effects of background noise when calculating the importance of result candidates. We formulate the procedure as a quadratic programming problem and solve it with an efficient optimization algorithm.
Integrating all the components into one system, we are able to accurately predict the location of social media and, as a result, improve the data quality for better content management. The whole system is also capable of providing fast and accurate image retrieval, which is crucial for large-scale applications.