Software fault prediction (SFP) has become a pivotal aspect in realm of software quality. Nevertheless, discipline of software quality suffers the starvation of fault datasets. Most of the research endeavors are focused on type of dataset, its granularity, metrics usedand metrics extractors. However, sporadic attention has been exerted on developmentof fault datasets and their associated challenges. There are very few publicly available datasets limiting the possibilities of comprehensive experiments on way to improvising the quality of software. Current research targets to address the challenges pertinent to fault dataset collection and developmentif one is not available publicly. It also considers dynamic identification of available resources such as public dataset, open-source software archieves, metrics parsers and intelligent labeling techniques. A framework for datasetcollection and development process has been furnished alongwith evaluation procedure for the identified resources.