In recent years, physical unclonable functions (PUFs) have gained a lot of attention as mechanisms for hardware-rooted device authentication. While the majority of the previously proposed PUFs derive entropy using dedicated circuitry, software PUFs achieve this from existing circuitry in a system. Such software-derived designs are highly desirable for low-power embedded systems as they require no hardware overhead. However, these software PUFs induce considerable processing overheads that hinder their adoption in resource-constrained devices. In this article, we propose DTA-PUF, a novel, software PUF design that exploits the instruction- and data-dependent dynamic timing behaviour of pipelined cores to provide a reliable challenge-response mechanism without requiring any extra hardware. DTA-PUF accepts sequences of instructions as an input challenge and produces an output response based on the manifested timing errors under specific over-clocked settings. To lower the required processing effort, we systematically select instruction sequences that maximise error-rate. The application to a post-layout pipelined floating-point unit, which is implemented in 45 nm process technology, demonstrates the effectiveness and practicability of our PUF design. Finally, DTA-PUF requires up to 50× fewer instructions than existing software processor PUF designs, limiting processing costs and resulting in up to 26% power savings.