Abstract-Although typical digital circuits are designed so that the clock period satisfies worst-case path delay constraints, the average input excitation often completes computation in less than a clock cycle. Variable latency units (VLUs) allow for improved throughput by allowing one clock cycle for some computations, and two clock cycles for others, using hold logic to differentiate between the two cases. However, they may experience significant throughput losses due to the effects of process variations. We develop a combined presilicon-postsilicon technique for variation-aware VLU design that ensures high throughputs across all manufactured chips. We achieve this by identifying path clusters at the presilicon stage, such that each element of a path cluster is likely to be similarly critical in a manufactured part. We use sensors to determine which path clusters is critical at the postsilicon stage and then activate the appropriate hold logics. Practically, for a small number of path clusters, significant improvements in throughput are achievable. On a set of 32nm PTM-based ISCAS89 circuits, our scheme offers 15.1% throughput enhancements with only 3.3% area overhead.