In this paper, we give a generalized formulation of a vision problem in the framework of modular integration and multiresolution. The developed framework is used to solve the high-level vision problem of scene interpretation. The formulation essentially involves the concepts of reductionism and multiresolution, where the given vision task $\mathcal{V}$ is broken down into simpler subtasks $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}_m$. Moreover, instead of solving the vision task $\mathcal{V} = \mathcal{V}^{\Omega}$ at the finest resolution $\Omega$, we solve the synergetically coupled vision subtasks at coarser resolutions $(\Omega - N)$ for $\Omega > N > 0$ and use the results obtained at resolution $(\Omega - N)$ to solve $\mathcal{V}^{\Omega - N + 1}$, the vision task at resolution $(\Omega - N + 1)$. Image interpretation is a two-phase analysis problem: salient features or objects in an image are first identified by segmenting the image, and the objects in the segmented image are then interpreted based on their spatial relationships. We present a solution to the joint segmentation and interpretation problem in the proposed generalized framework. For the interpretation part, we exploit the Markov Random Field (MRF) based image interpretation scheme developed by Modestino and Zhang. Experimental results on both indoor and outdoor images are presented to validate the proposed framework.
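
The coarse-to-fine idea described above can be illustrated with a minimal sketch. It is not the segmentation or MRF interpretation scheme of the paper: the function names (`build_pyramid`, `refine`, `coarse_to_fine`), the 2x2 averaging pyramid, the binary labels, and the simple neighbour-agreement update are all assumptions chosen only to show how a result obtained at a coarser resolution can initialise the task at the next finer resolution.

```python
def downsample(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def build_pyramid(img, levels):
    """Return images ordered from coarsest to finest (the input is last)."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


def upsample_labels(labels):
    """Copy each label into a 2x2 block to initialise the next finer level."""
    out = []
    for row in labels:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def refine(img, labels, threshold, beta=0.3, sweeps=2):
    """Hypothetical stand-in for a subtask solver: each pixel takes the label
    that minimises a data cost plus a penalty for disagreeing 4-neighbours."""
    h, w = len(img), len(img[0])
    labels = [row[:] for row in labels]  # do not modify the caller's labels
    for _ in range(sweeps):
        for r in range(h):
            for c in range(w):
                costs = []
                for cand in (0, 1):
                    if cand == 1:
                        data = max(0.0, threshold - img[r][c])
                    else:
                        data = max(0.0, img[r][c] - threshold)
                    disagree = 0
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] != cand:
                            disagree += 1
                    costs.append(data + beta * disagree)
                labels[r][c] = 0 if costs[0] <= costs[1] else 1
    return labels


def coarse_to_fine(img, levels, threshold):
    """Solve at the coarsest level, then reuse each result one level finer."""
    pyramid = build_pyramid(img, levels)
    # Coarsest level: plain thresholding serves as the initial solution.
    labels = [[1 if v > threshold else 0 for v in row] for row in pyramid[0]]
    for level_img in pyramid[1:]:
        labels = refine(level_img, upsample_labels(labels), threshold)
    return labels


# Toy 8x8 image: a bright 4x4 square plus one isolated noisy pixel.
image = [[1.0 if (r < 4 and c < 4) else 0.0 for c in range(8)] for r in range(8)]
image[6][6] = 0.6
result = coarse_to_fine(image, levels=2, threshold=0.5)
```

In this toy case the isolated pixel at `image[6][6]` exceeds the threshold on its own, but it is averaged away at the coarser levels and the neighbour-agreement term keeps it in the background at the finest level, so only the 4x4 square is labelled 1.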