Travel behavior studies are among the most complex, because they evaluate many parameters simultaneously and require collecting large volumes of data on human behavior in a variety of situations (Rastogi and Krishna Rao in Segmentation analysis of commuters accessing transit: Mumbai study. J Transp Eng 135(8):506-515, 2009). Their success depends largely on sound data collection. The aim of this paper is to evaluate and suggest the most suitable database development approach and research strategies for travel behavior studies, especially in the context of Indian metro cities. It also presents the complete survey design procedure adopted for a travel mode shift study, along with the researchers' experience in executing the survey and the difficulties faced by the field teams. A three-level filtration process is adopted to select variables for developing the mode shift models. Initially, the least important variables are omitted based on variable ranking results. The seven top-ranked variables are identified for the final survey design: travel time (including transit access time, waiting time and in-vehicle time), travel cost and number of transfers are directly measurable variables, while comfort, convenience, safety and security are latent variables. The latent variables are measured using a psychometric scale constructed from various latent variable indicators and then converted into latent classes using principal component analysis. The research instrument developed and adopted for the present study was tested through a pilot survey and corrected as per data compatibility requirements. Finally, 1017 valid responses out of 1500 distributed questionnaires are analyzed to evaluate the effects of hypothetically enhanced public transport services on travel mode shift behavior through the development of binary logistic regression models.
Changes in model performance with the inclusion of latent classes are also evaluated using various statistical tools. This paper presents original work based on primary data from a field survey; its findings are expected to provide a better understanding of the entire database development process and of the effect of different variables on mode shift behavior in the context of an Indian metropolitan area.
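The modeling workflow summarized above can be illustrated with a minimal Python sketch. It is not the authors' data or estimation code: all data are synthetic, the variable names (`time_saving`, `cost_saving`, `comfort_score`) and the four Likert indicators are hypothetical, and the latent variable is represented by a continuous first principal component score rather than the latent classes used in the study. The sketch fits a binary logit by Newton-Raphson with and without the latent score and compares the two fits with a likelihood ratio statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1017  # sample size taken from the abstract; every value below is synthetic

# Hypothetical measurable variables: savings offered by the improved service
time_saving = rng.uniform(0, 30, n)  # minutes (assumed unit)
cost_saving = rng.uniform(0, 20, n)  # currency units (assumed unit)

# Hypothetical 5-point Likert ratings for four indicators of one latent variable
trait = rng.normal(size=n)
noise = rng.normal(scale=0.7, size=(n, 4))
likert = np.clip(np.rint(3 + trait[:, None] + noise), 1, 5)

def first_pc_score(items):
    """Return first principal component scores and the variance share explained."""
    z = (items - items.mean(axis=0)) / items.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    w = eigvecs[:, -1]          # eigh sorts eigenvalues in ascending order
    if w.sum() < 0:             # fix the sign so higher ratings give higher scores
        w = -w
    return z @ w, eigvals[-1] / eigvals.sum()

def fit_logit(X, y, n_iter=25, tol=1e-8):
    """Fit a binary logit with intercept by Newton-Raphson; return coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))
        grad = X1.T @ (y - p)
        hess = X1.T @ (X1 * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def log_likelihood(X, y, beta):
    """Bernoulli log-likelihood of a fitted logit model."""
    eta = np.column_stack([np.ones(len(X)), X]) @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

# Synthetic shift decisions (1 = shifts to public transport), assumed utility form
utility = -2.0 + 0.08 * time_saving + 0.05 * cost_saving + 0.8 * trait
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-utility))).astype(float)

comfort_score, explained_share = first_pc_score(likert)

X_base = np.column_stack([time_saving, cost_saving])
X_ext = np.column_stack([time_saving, cost_saving, comfort_score])

beta_base = fit_logit(X_base, y)
beta_ext = fit_logit(X_ext, y)
ll_base = log_likelihood(X_base, y, beta_base)
ll_ext = log_likelihood(X_ext, y, beta_ext)
lr_stat = 2.0 * (ll_ext - ll_base)  # likelihood ratio statistic, 1 degree of freedom

print("variance share of first component:", round(explained_share, 3))
print("log-likelihood base / extended:", round(ll_base, 2), round(ll_ext, 2))
print("likelihood ratio statistic:", round(lr_stat, 2))
```

Because the two models are nested, the extended model's log-likelihood cannot be lower than the base model's; the likelihood ratio statistic is one simple way to judge whether adding the latent score improves the fit.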