Capturing program and data traces unobtrusively and in real time during program execution is crucial for debugging and testing cyber-physical systems. However, tracing a complete program unobtrusively is often cost-prohibitive, requiring large on-chip trace buffers and wide trace ports. Whereas program execution traces can be compressed efficiently in hardware, compressing data address and data value traces is much more challenging because they contain limited redundancy. In this paper, we describe two hardware-based filtering techniques for data traces: cache first-access tracking for load data values and data address filtering using partial register-file replay. Our experimental analysis indicates that the proposed filtering techniques significantly reduce the size of the data traces (by a factor of ~5-20 for the load data value trace, depending on the data cache size, and by a factor of ~5 for the data address trace) at the cost of rather small hardware structures in the trace module.
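
To make the idea of cache first-access tracking concrete, the following is a minimal software sketch, not the hardware design evaluated in this paper. It assumes a direct-mapped data cache with one first-access flag per word, and it assumes that the software debugger mirrors the same cache model, so that a load value needs to be traced only the first time a word is read after its line is brought into the cache. The names `FirstAccessFilter`, `load`, and `filter_load_trace`, as well as the cache geometry, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FirstAccessFilter:
    """Toy direct-mapped cache model with one first-access flag per word.

    Assumption: addresses are word addresses, and a line replacement
    clears all flags of the replaced line.
    """
    num_lines: int = 4
    words_per_line: int = 4
    tags: list = field(init=False)
    flags: list = field(init=False)

    def __post_init__(self):
        self.tags = [None] * self.num_lines
        self.flags = [[False] * self.words_per_line
                      for _ in range(self.num_lines)]

    def load(self, word_addr: int) -> bool:
        """Return True if the loaded value must be emitted to the trace."""
        line_addr, offset = divmod(word_addr, self.words_per_line)
        index = line_addr % self.num_lines
        tag = line_addr // self.num_lines
        if self.tags[index] != tag:
            # Cache miss: the line is replaced, so its flags are reset.
            self.tags[index] = tag
            self.flags[index] = [False] * self.words_per_line
        if self.flags[index][offset]:
            # Word already reported since the line was filled: filter it.
            return False
        self.flags[index][offset] = True
        return True


def filter_load_trace(loads, filt):
    """Keep only the (address, value) pairs that pass the filter."""
    return [(addr, value) for addr, value in loads if filt.load(addr)]


loads = [(0, 10), (0, 10), (1, 11), (16, 99), (0, 10)]
emitted = filter_load_trace(loads, FirstAccessFilter())
print(f"{len(emitted)} of {len(loads)} load values traced")
```

In this toy run, the repeated load from word 0 is filtered, while the load from word 16 maps to the same line, evicts it, and forces the later load from word 0 to be traced again. A larger cache suffers fewer such evictions, which is consistent with the reduction depending on the data cache size.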