Recovery of three-dimensional (3D) coordinates using a set of images with texture mapping to generate a 3D mesh has been of great interest in computer graphics and 3D imaging applications. This work aims to propose an approach to adaptive view selection (AVS) that determines the optimal number of images to generate the synthesis result using the 3D mesh and textures in terms of computational complexity and image quality (peak signal-to-noise ratio (PSNR)). All 25 images were acquired by a set of cameras in a 5×5 array structure, and rectification had already been performed. To generate the mesh, depth map extraction was carried out by calculating the disparity between the matched feature points. Synthesis was performed by fully exploiting the content included in the images followed by texture mapping. Both the 2D colored images and grey-scale depth images were synthesized based on the geometric relationship between the images, and to this end, three-dimensional synthesis was performed with a smaller number of images, which was less than 25. This work determines the optimal number of images that sufficiently provides a reliable 3D extended view by generating a mesh and image textures. The optimal number of images contributes to an efficient system for 3D view generation that reduces the computational complexity while preserving the quality of the result in terms of the PSNR. To substantiate the proposed approach, experimental results are provided.