We introduce a new methodology based on refinement for testing the functional correctness of hardware and low-level software. Our methodology overcomes several major drawbacks of the de facto testing methodologies used in industry: (1) it is difficult to determine completeness of the properties and tests under consideration (2) defining oracles for tests is expensive and error-prone (3) properties are defined in terms of low-level designs. Our approach compiles a formal refinement conjecture into a runtime check that is performed during simulation. We describe our methodology, discuss algorithmic issues, and provide experimental validation using a 5-stage RISCV pipelined microprocessor and hypervisor.