Computed ultrasound tomography in echo mode (CUTE) is a promising ultrasound (US) based multi-modal technique that allows imaging the spatial distribution of speed of sound (SoS) inside tissue using hand-held pulse-echo US. It is based on measuring the phase shift of echoes detected under varying steering angles. The SoS is then reconstructed by a regularized inversion of a forward model that describes the relation between the SoS and the echo phase shift. Promising results were obtained in phantoms when using a Tikhonov-type regularization of the spatial gradient (SG) of the SoS. In vivo, however, clutter and aberration lead to increased phase noise. In many subjects, this phase noise causes strong artifacts in the SoS image when using the SG regularization. To overcome this shortcoming, we propose a Bayesian framework for the inversion, which includes a priori statistical properties of the spatial distribution of the SoS to avoid noise-related artifacts in the SoS images. In this study, the a priori model is based on segmenting the B-mode image. We show in a simulation and phantom study that this approach leads to SoS images that are much more stable against phase noise than those obtained with the SG regularization. In a preliminary in-vivo study, a reproducibility in the range of 10 m s⁻¹ was achieved when imaging the SoS of a volunteer's liver from different scanning locations. These results demonstrate the diagnostic potential of CUTE, for example for the staging of fatty liver disease.
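
The two inversion schemes contrasted above can be sketched in generic linear-inverse-problem notation. All symbols here are assumptions introduced for illustration and are not taken from the source: $\Delta\boldsymbol{\tau}$ is the vector of measured echo phase shifts, $\boldsymbol{\sigma}$ the unknown slowness (inverse SoS) distribution, $\mathbf{M}$ a linearized forward model, $\mathbf{D}$ a discrete spatial gradient operator, and $\gamma$ a regularization weight. Under these assumptions, a Tikhonov-type SG regularization might take the form

\[
\hat{\boldsymbol{\sigma}}_{\mathrm{SG}}
= \arg\min_{\boldsymbol{\sigma}} \; \lVert \mathbf{M}\boldsymbol{\sigma} - \Delta\boldsymbol{\tau} \rVert_2^2 + \gamma \lVert \mathbf{D}\boldsymbol{\sigma} \rVert_2^2
= \left( \mathbf{M}^{\mathsf T}\mathbf{M} + \gamma\, \mathbf{D}^{\mathsf T}\mathbf{D} \right)^{-1} \mathbf{M}^{\mathsf T} \Delta\boldsymbol{\tau} ,
\]

whereas a Bayesian estimate, assuming a Gaussian prior with mean $\boldsymbol{\sigma}_0$ and covariance $\mathbf{C}_{\sigma}$ (for instance derived from a segmentation) and Gaussian phase noise with covariance $\mathbf{C}_{n}$, would read

\[
\hat{\boldsymbol{\sigma}}_{\mathrm{B}}
= \boldsymbol{\sigma}_0 + \mathbf{C}_{\sigma}\mathbf{M}^{\mathsf T} \left( \mathbf{M}\mathbf{C}_{\sigma}\mathbf{M}^{\mathsf T} + \mathbf{C}_{n} \right)^{-1} \left( \Delta\boldsymbol{\tau} - \mathbf{M}\boldsymbol{\sigma}_0 \right) .
\]

In this sketch the prior covariance $\mathbf{C}_{\sigma}$ takes over the role of the gradient penalty: it restricts the solution to spatial variations deemed plausible a priori, which is one way such a framework could suppress noise-driven artifacts. The exact forward model and prior used in the study may differ from this generic form.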