Recent research has shown that, when time-of-flight (TOF) information is used, attenuation images can be determined from emission data jointly with activity images, up to a scaling constant. We aim to develop practical CT-less joint reconstruction for clinical TOF PET scanners to obtain quantitatively accurate activity and attenuation images. In this work, we present a joint reconstruction of activity and attenuation based on MLAA (maximum likelihood reconstruction of attenuation and activity) with autonomous scaling determination and joint TOF scatter estimation from TOF PET data. Our idea for scaling is to use a selected volume of interest (VOI) with known attenuation in the reconstructed attenuation image, e.g., the liver in patient imaging. First, we construct a unit attenuation medium whose support is similar, though not necessarily identical, to that of the imaged emission object. All detectable LORs intersecting the unit medium have an attenuation factor of
$e^{-1} \approx 0.3679$, i.e., the line integral of the linear attenuation coefficients along each such LOR is one. The scaling factor can then be determined from the difference between the reconstructed attenuation image and the known attenuation within the selected VOI, normalized by the unit attenuation medium. A four-step iterative joint reconstruction algorithm is developed. In each iteration, (1) the activity is updated using TOF OSEM from TOF list-mode data; (2) the attenuation image is updated using XMLTR, an extended MLTR, from non-TOF LOR sinograms; (3) a scaling factor is determined based on the selected VOI, and both the activity and attenuation images are updated using the estimated scaling; and (4) scatter is estimated using TOF single scatter simulation with the jointly reconstructed activity and attenuation images. The performance of the joint reconstruction is studied using simulated data from a generic whole-body clinical TOF PET scanner and a long-axial-FOV research PET scanner, as well as 3D experimental data from the PennPET Explorer scanner. We show that the proposed joint reconstruction with proper autonomous scaling provides low-bias results comparable to the reference reconstruction with known attenuation.
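
The scaling step (3) can be illustrated with a minimal sketch. The abstract gives no explicit formula, so the model below is an assumption: the reconstructed attenuation image is taken to differ from the true one by an unknown multiple k of the unit attenuation medium, so that every LOR through the object picks up the same extra attenuation factor exp(-k), and the reconstructed activity is correspondingly scaled by exp(k). The function names, the flat-list image representation, and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def estimate_scaling_offset(mu_recon, mu_known, unit_medium, voi):
    """Estimate k under the assumed model mu_recon = mu_true + k * unit_medium.

    mu_recon    -- reconstructed attenuation image (flat list of voxel values)
    mu_known    -- known attenuation coefficient inside the VOI (scalar)
    unit_medium -- unit attenuation medium image (flat list, same length)
    voi         -- voxel indices belonging to the selected VOI
    """
    # Difference between reconstructed and known attenuation in the VOI ...
    diff = sum(mu_recon[j] - mu_known for j in voi)
    # ... normalized by the unit attenuation medium over the same voxels.
    norm = sum(unit_medium[j] for j in voi)
    if norm <= 0.0:
        raise ValueError("VOI must overlap the unit attenuation medium")
    return diff / norm


def apply_scaling(activity_recon, mu_recon, unit_medium, k):
    """Remove the offset k from the attenuation image and rescale the activity."""
    mu = [m - k * u for m, u in zip(mu_recon, unit_medium)]
    # Assumption: lowering every line integral by k raises the attenuation
    # factors by exp(k), so the activity must shrink by exp(-k) to keep the
    # modelled emission data unchanged.
    activity = [a * math.exp(-k) for a in activity_recon]
    return activity, mu


# Toy 1D example with made-up values (not taken from the paper).
mu_true = [0.0, 0.096, 0.096, 0.096, 0.0]
activity_true = [0.0, 1.0, 2.0, 1.0, 0.0]
unit_medium = [0.0, 0.5, 0.4, 0.5, 0.0]
k_true = 0.05

# Simulate an unscaled joint reconstruction under the assumed model.
mu_recon = [m + k_true * u for m, u in zip(mu_true, unit_medium)]
activity_recon = [a * math.exp(k_true) for a in activity_true]

voi = [1, 2]  # voxels with known attenuation (e.g. a liver region)
k_est = estimate_scaling_offset(mu_recon, 0.096, unit_medium, voi)
activity_fixed, mu_fixed = apply_scaling(activity_recon, mu_recon, unit_medium, k_est)
```

In this sketch the offset is recovered from a VOI-wide sum rather than voxel by voxel, which is one plausible reading of "normalized by the unit attenuation medium"; the actual estimator used in the work may differ.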