Modern architectures track memory accesses using page-granularity metadata such as access and dirty bits, leading to fundamental tradeoffs for system software that uses this metadata. Larger page sizes reduce address translation overheads and page table footprints. However, coarse metadata bits for larger pages limit software's visibility into application-level memory usage, resulting in memory bloat and performance pathologies. As DRAM capacity continues to expand, we expect software to react by aggressively mapping with larger page sizes, making this tradeoff space more challenging to navigate. We study the relationship between metadata granularity and fidelity, the degree to which metadata correctly approximates actual access patterns. We focus on 2MB page support on x86-64 and GPUs, measuring fidelity across a wide range of benchmarks. Fidelity can be poor at a coarse granularity, and it varies widely even within a single application. To address this problem, we propose P, which provides architectural support for variable-granularity access and dirty bits. Evaluation of Linux/x86-64 and GPU prototypes of P shows that modest additional hardware can reduce metadata fidelity loss by up to 65% and 55% at a performance cost of less than 1% and 2% on CPUs and GPUs, respectively. We show that the recovered fidelity can eliminate performance pathologies and improve the performance of GPGPU applications using demand paging by 29.8% on average.

CCS CONCEPTS • Software and its engineering → Virtual memory; • Computer systems organization → Parallel architectures.
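
To make the notion of fidelity concrete, the following minimal sketch compares the footprint reported by access bits at 4KB and 2MB granularity for a sparse and a dense access pattern. The metric used here (the footprint at a 4KB reference granularity divided by the footprint reported at the coarser granularity) is an illustrative assumption and not necessarily the definition used in the paper; the names `marked_bytes` and `fidelity` are hypothetical.

```python
BASE_PAGE = 4 * 1024          # 4KB base page
HUGE_PAGE = 2 * 1024 * 1024   # 2MB large page


def marked_bytes(touched_addrs, granularity):
    """Bytes reported as accessed when one access bit covers `granularity` bytes."""
    pages = {addr // granularity for addr in touched_addrs}
    return len(pages) * granularity


def fidelity(touched_addrs, granularity, reference=BASE_PAGE):
    """Assumed metric: reference-granularity footprint / reported footprint.

    1.0 means the coarse bits report no more memory than the reference bits;
    values near 0 mean the coarse bits heavily overstate actual usage.
    """
    reported = marked_bytes(touched_addrs, granularity)
    if reported == 0:
        return 1.0  # nothing touched, so nothing is misreported
    return marked_bytes(touched_addrs, reference) / reported


# Sparse pattern: one byte in every 64th 4KB page of a single 2MB region.
sparse = [i * 64 * BASE_PAGE for i in range(8)]
# Dense pattern: one byte in every 4KB page of the same 2MB region.
dense = [i * BASE_PAGE for i in range(HUGE_PAGE // BASE_PAGE)]

print(fidelity(sparse, HUGE_PAGE))  # 8 of 512 base pages touched
print(fidelity(dense, HUGE_PAGE))   # every base page touched
```

Under this assumed metric, the sparse pattern sets a single 2MB access bit although only 8 of the 512 underlying 4KB pages were touched, which is the kind of visibility loss the abstract attributes to coarse metadata.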