Digital Surface Model (DSM) generation from satellite imagery is a long-standing problem that has been addressed in several ways over the years; however, experts and users are continuously searching for simpler yet accurate and reliable software solutions. One of the latest is provided by the commercial software Agisoft Metashape (since version 1.6), previously known as Photoscan, which joins other already available open-source and commercial software tools. The present work aims to quantify the potential of the new Agisoft Metashape satellite processing module, considering that, to the best of the authors' knowledge, only two papers have been published on it, and neither considers cross-sensor imagery. We investigated two case studies to evaluate the accuracy of the generated DSMs. The first dataset consists of a triplet of Pléiades images acquired over the area of Trento and the Adige valley (Northern Italy), which is characterized by a great variety of geomorphology, land use and land cover. The second consists of a triplet composed of a WorldView-3 stereo pair and a GeoEye-1 image, acquired over the city of Matera (Southern Italy), one of the oldest settlements in the world, with the world-famous area of the Sassi and a very rugged morphology in the surroundings. First, we carried out the accuracy assessment using the Rational Polynomial Coefficients (RPCs) supplied by the satellite companies as part of the image metadata. Then, we refined the RPCs with an original terrain-independent technique able to supply a new set of RPCs, using a set of Ground Control Points (GCPs) adequately distributed across the regions of interest. The DSMs were generated in both stereo and multi-view (triplet) configurations. We assessed the accuracy and completeness of these DSMs through a comparison with suitable references, i.e., DSMs obtained through LiDAR technology. The impact of the RPC refinement on the DSM accuracy is high, ranging from 20 to 40% in terms of LE90 (linear error at the 90% probability level).
After the RPC refinement, we achieved an average overall LE90 below 5.0 m (Trento) and 4.0 m (Matera) for the stereo configuration, and below 5.5 m (Trento) and 4.5 m (Matera) for the multi-view (triplet) configuration, the latter with an increase in completeness of 5–15% with respect to stereo pairs. Finally, we analyzed the impact of land cover on the accuracy of the generated DSMs; results for three classes (urban areas, agricultural areas, and forest and semi-natural areas) are also provided.
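
As an illustration of the two metrics reported above, the following minimal sketch shows one common way to compute LE90 and completeness for a DSM against a co-registered reference grid. It is not the procedure used in this work: the function name `dsm_accuracy`, the use of NaN as the no-data marker, the definition of LE90 as the 90th percentile of absolute height differences, and the definition of completeness as the share of valid reference cells that also hold a valid DSM height are all assumptions made for the example.

```python
import numpy as np


def dsm_accuracy(dsm, reference):
    """Return (LE90, completeness) of a DSM against a co-registered reference.

    Hypothetical helper: both grids are assumed to share the same shape and
    georeferencing, with NaN marking cells without a height value.
    """
    dsm = np.asarray(dsm, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if dsm.shape != reference.shape:
        raise ValueError("DSM and reference must share the same grid")

    # Cells where the reference (e.g. a LiDAR DSM) provides a height.
    ref_valid = np.isfinite(reference)
    # Cells where both the generated DSM and the reference are defined.
    both_valid = ref_valid & np.isfinite(dsm)
    if not both_valid.any():
        raise ValueError("no overlapping valid cells")

    # Absolute height differences on the common cells.
    abs_err = np.abs(dsm[both_valid] - reference[both_valid])

    # LE90 (assumed definition): 90th percentile of absolute height errors.
    le90 = float(np.percentile(abs_err, 90))

    # Completeness (assumed definition): fraction of reference cells matched.
    completeness = float(both_valid.sum() / ref_valid.sum())
    return le90, completeness
```

Under these assumptions, the relative effect of the RPC refinement could be expressed as the percentage change in LE90 between the DSM generated with the original RPCs and the one generated with the refined RPCs, and the same function could be applied per land cover class by masking the grids beforehand.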