Abstract. Academic researchers often collect multi-source, multi-scale, and multidisciplinary data within experimental watersheds, observatories, and research sites. The ability to find and access data of interest across diverse data sources through a single environment would greatly benefit researchers. Our work makes two major contributions. First, to alleviate semantic heterogeneity and bring semantic information into the data retrieval process, we propose a semi-automatic approach for building a high-quality, domain-specific water environment ontology from different candidate corpora. The approach is a cyclical process involving the successive and flexible application of various NLP techniques, statistical algorithms, and intelligent data mining approaches for ontology learning and ontology modelling. Second, we developed an ontology-based water environmental data retrieval system (Onto-WEDR) using Service-Oriented Architecture (SOA) and Rich Internet Application (RIA) techniques. It provides a centralized, easy-to-use interactive infrastructure that enables users to search, access, download, and visualize water data in a single environment. The feasibility and effectiveness of the Onto-WEDR prototype were demonstrated through several investigations of water quality data discovery and retrieval at the basin scale in China.