Background
The Tomosynthesis Mammography Imaging Screening Trial (TMIST), EA1151, conducted by the Eastern Cooperative Oncology Group (ECOG)/American College of Radiology Imaging Network (ACRIN), is a randomized clinical trial designed to assess the effectiveness of digital breast tomosynthesis (TM) compared to digital mammography (DM) for breast cancer screening. Equipment from multiple vendors is being used in the study.

Purpose
For the findings of the study to be valid and to capture the true capabilities of the two technology types, it is important that all equipment is operated within appropriate parameters with regard to image quality and dose. A harmonized QC program was established by a core physics team. Since there are over 120 trial sites, a centralized, automated QC program was chosen as the most practical design. This report presents results of the weekly QC testing program. A companion paper will review quality monitoring based on data from the headers of the patient images.

Methods
Study images are collected centrally after de‐identification using the “TRIAD” application developed by ACR. The core physics team devised and implemented a minimal set of quality control (QC) tests to evaluate the tomosynthesis and 2D mammography systems. Weekly, monthly, and annual testing is performed by the site mammography technologists, with images submitted directly to the physics core. The weekly physics QC tests are described: SDNR of a low‐contrast mass object, artifact spread, spatial resolution, tracking of technical factors, and in‐slice noise power spectra.

Results
As of December 31, 2022 (5 years), 145 sites with 411 machines had submitted QC data. A total of 136 742 TMIST participant screening imaging studies had been performed. The 5th and 95th percentile mean glandular doses for a single tomosynthesis exposure to a 4.0 cm thick PMMA phantom (“standard breast phantom”) were 1.24 and 1.68 mGy, respectively.
The largest sources of QC non‐conformance were operator error, not following the QC protocol exactly, and unreported software updates and preventive maintenance activities that affected QC setpoints. Noise power spectra were measured; however, standardization of performance targets across machine types and software revisions was difficult. Nevertheless, for each machine type, test measurement results were very consistent when the protocol was followed. Deviations in test results were mostly related to software and hardware changes.

Conclusion
Most systems performed very consistently. Although this is a harmonized program using identical phantoms and testing protocols, it is not appropriate to apply universal threshold or target metrics across the machine types because the systems have different non‐linear reconstruction algorithms and image display filters. It was found to be more useful to assess pass/fail criteria in terms of relative deviations from baseline values established when a system is first characterized and after equipment is changed. Generally, systems that needed repair failed suddenly, but in retrospect, for a few cases, drops in SDNR and increases in mAs were observed prior to tube failure.

TMIST is registered at ClinicalTrials.gov as NCT03233191.
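
The baseline‐relative pass/fail approach described in the Conclusion can be illustrated with a minimal Python sketch. It is not the trial's software. The SDNR formula shown is a commonly used definition assumed here for illustration, and the function names, the pixel values, and the 10% tolerance are all hypothetical; the abstract does not state the tolerances actually applied.

```python
from statistics import mean, stdev


def sdnr(object_pixels, background_pixels):
    """Signal-difference-to-noise ratio (assumed definition):
    |mean(object) - mean(background)| / sample std of background."""
    signal_difference = abs(mean(object_pixels) - mean(background_pixels))
    return signal_difference / stdev(background_pixels)


def relative_deviation(value, baseline):
    """Fractional change of a weekly measurement from its baseline."""
    return (value - baseline) / baseline


def passes_qc(value, baseline, tolerance):
    """Pass if the measurement stays within +/- tolerance of the baseline.

    The baseline is per machine, not universal, and would be
    re-established after an equipment or software change.
    """
    return abs(relative_deviation(value, baseline)) <= tolerance


# Hypothetical numbers: baseline SDNR of 8.0 and a 10% tolerance.
baseline_sdnr = 8.0
tolerance = 0.10
for weekly_sdnr in (7.5, 7.0):
    dev = relative_deviation(weekly_sdnr, baseline_sdnr)
    status = "pass" if passes_qc(weekly_sdnr, baseline_sdnr, tolerance) else "fail"
    print(f"SDNR {weekly_sdnr}: deviation {dev:+.1%} -> {status}")
```

Because each machine is compared only with its own baseline, the same check can be reused across vendors whose reconstruction algorithms yield different absolute SDNR values.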