This study investigated implementation strategies for optimizing the precision of Tapped Delay Line (TDL) Time-to-Digital Converters (TDCs) designed for Xilinx 20 nm UltraScale Field-Programmable Gate Arrays (FPGAs). The optimization aims to bridge the performance gap between FPGA-based TDCs, which are more flexible and better suited to fast prototyping, and higher-performing Application-Specific Integrated Circuit (ASIC) solutions, making FPGA-based TDCs viable for cutting-edge applications. Our key areas of focus were the optimal design of the decoder, the degree of sub-interpolation, and the placement of the TDLs, with particular emphasis on the clock distribution scheme within the Configurable Logic Block (CLB), in order to minimize the effects of Bubble Errors (BEs) and quantization error. The research led to the development and comparison of multiple TDL TDC solutions implemented on a Kintex UltraScale device (XCKU040-2FFVA1156E) hosted on a KCU105 general-purpose Evaluation Board (EVB). From these, two main solutions emerged: one with high precision and one with low area occupation. The first was characterized by a Single-Shot Precision (SSP) of 2.64 ps r.m.s. and by Differential and Integral Non-Linearity (DNL/INL) errors of 0.523 ps and 16.939 ps, respectively, occupying 883 CLBs and 126 kb of Block RAM (BRAM). The second had an SSP of 3.75 ps r.m.s., a DNL of 0.599 ps, and an INL of 7.151 ps, while occupying only 259 CLBs and 72 kb of BRAM.