In this paper, we design and analyze strategies to replicate the execution of an application on two different platforms subject to failures, using checkpointing on a shared stable storage. We derive the optimal pattern size W for a periodic checkpointing strategy where both platforms concurrently try to execute W units of work before checkpointing. The first platform that completes its pattern takes a checkpoint, and the other platform interrupts its execution to synchronize from that checkpoint. We compare this strategy to a simpler on-failure checkpointing strategy, where a checkpoint is taken by one platform only whenever the other platform encounters a failure. We use first- or second-order approximations to compute overheads and optimal pattern sizes, and show through extensive simulations that these models are very accurate. The simulations show the usefulness of a secondary platform to reduce execution time, even when the platforms have relatively different speeds: on average, over a wide range of scenarios, the overhead is reduced by 30%. The simulations also demonstrate that the periodic checkpointing strategy is globally more efficient, unless platform speeds are quite close.
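As an informal illustration of the periodic strategy described above, the following minimal Monte Carlo sketch estimates the mean time spent per unit of work for a given pattern size W. It is not the simulator used in this paper, and it rests on simplifying assumptions of our own: failures on each platform follow independent exponential laws, a failed platform pays a fixed recovery cost and restarts the pattern from the last checkpoint, and checkpoints are failure-free with a fixed cost. All function and parameter names (`time_to_finish`, `simulate_pattern`, `mean_time_per_work_unit`, `speeds`, `fail_rates`, `checkpoint`, `recovery`) are hypothetical.

```python
import random


def time_to_finish(work, speed, fail_rate, recovery, rng):
    """Time for one platform to complete `work` units of work.

    Assumption: exponential inter-failure times with rate `fail_rate`;
    after each failure the platform pays `recovery` and restarts the
    pattern from the last checkpoint.
    """
    needed = work / speed
    elapsed = 0.0
    while True:
        if fail_rate <= 0:
            # Failure-free platform: the pattern always succeeds.
            return elapsed + needed
        next_failure = rng.expovariate(fail_rate)
        if next_failure >= needed:
            # No failure strikes before the pattern completes.
            return elapsed + needed
        # Work since the last checkpoint is lost; recover and retry.
        elapsed += next_failure + recovery


def simulate_pattern(work, speeds, fail_rates, checkpoint, recovery, rng):
    """Duration of one pattern: the first platform to finish checkpoints.

    Assumption: the checkpoint itself is failure-free, and the other
    platform synchronizes from it at no extra cost.
    """
    finish = [
        time_to_finish(work, s, lam, recovery, rng)
        for s, lam in zip(speeds, fail_rates)
    ]
    return min(finish) + checkpoint


def mean_time_per_work_unit(work, speeds, fail_rates, checkpoint, recovery,
                            patterns=10000, seed=0):
    """Average time per unit of work over `patterns` simulated patterns."""
    rng = random.Random(seed)
    total = sum(
        simulate_pattern(work, speeds, fail_rates, checkpoint, recovery, rng)
        for _ in range(patterns)
    )
    return total / (patterns * work)
```

Sweeping `work` over a range of values and keeping the one that minimizes `mean_time_per_work_unit` gives a rough empirical counterpart to the optimal pattern size W under these assumptions; passing a single-element `speeds` and `fail_rates` gives the single-platform baseline for comparison.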