Decreasing read cell current (I_CELL) has become a key trend in nonvolatile memory (NVM). This is due not only to device-size and V_DD scaling at an unchanged threshold voltage (V_TH), but also to the growing spread of the following applications: 1) multiple-level-cell (MLC) [1-2] to achieve smaller area-per-bit; 2) lower V_DD [3] to save power; 3) logic-process-compatible one-time programming (OTP) memories for embedding into mobile chips. A smaller I_CELL leaves sense amplifier (SA) operation vulnerable to 1) bitline (BL) level offset due to noise, bias and load (C_BL) mismatches and 2) V_TH variation. As device size and BL pitch are continually scaled down, these factors have become major showstoppers for SAs. To tolerate these offsets, small-I_CELL NVMs suffer from slow read speed or high read-fail probability. Thus, a more offset-tolerant SA is a prerequisite for faster read speeds. In this study, we propose a new offset-tolerant current-sampling-based SA (CSB-SA) that achieves 7× faster read speed than previous SAs for sensing small I_CELL. A fabricated 90nm 512Kb OTP macro, using the CSB-SA and our CMOS-logic-compatible OTP cell [4], achieves 26ns macro random access time for reading sub-200nA I_CELL. Measurements also confirmed that this 90nm CSB-SA could achieve sub-100nA sensing.

Many small-I_CELL NVMs employ a voltage-mode SA (VSA) [2] with a long BL developing time to tolerate SA offset, at the cost of reduced read speed. Current-mode SAs (CSAs) achieve faster read speeds than VSAs [1]. Cascode-current-load or resistive-divider-like CSAs (RD-CSAs) [1], [5] achieve sub-100nA sensing, but require long BL settling times to obtain a high-accuracy 1st-stage voltage difference. The inverter-offset-compensated SA (IOC-SA) [6] reduces the SA offset. However, BL offset and BL settling time still limit its advantages over VSA/CSA.
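As a rough illustration of why a long BL developing time becomes costly as I_CELL shrinks, the sketch below uses the first-order constant-current relation t = C_BL × ΔV / I_CELL. The function name, the BL capacitance and the required voltage margin are hypothetical values chosen for illustration only; they are not taken from this work, and real bitlines also see leakage and a bias-dependent cell current.

```python
def bl_develop_time(c_bl_farads, v_margin_volts, i_cell_amps):
    """First-order time for a constant cell current to move the bitline
    by v_margin_volts: t = C * dV / I. Ignores leakage and nonlinearity."""
    if i_cell_amps <= 0:
        raise ValueError("cell current must be positive")
    return c_bl_farads * v_margin_volts / i_cell_amps


# Hypothetical values (assumptions, not measurements from the paper).
C_BL = 200e-15      # bitline capacitance in farads
V_MARGIN = 50e-3    # BL swing needed to overcome SA offset, in volts

for i_cell in (10e-6, 1e-6, 200e-9, 100e-9):
    t = bl_develop_time(C_BL, V_MARGIN, i_cell)
    print(f"I_CELL = {i_cell * 1e9:8.0f} nA -> develop time = {t * 1e9:7.1f} ns")
```

Under these assumptions the developing time scales inversely with I_CELL, so halving the cell current doubles the time needed to build the same margin on the bitline.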
The current-mirror CSA (CM-CSA) [7], which compares I_CELL with a reference current (I_REF), has fast read speeds but cannot sense small I_CELL due to its input-stage V_TH mismatch. Figure 11.5.1 compares the concept of the CSB-SA with those of previous SAs. The CSB-SA uses the same MOS device for current sampling and current-ratio amplifying. This enables V_TH-independent current sampling for its differential I_CELL and I_REF inputs. This differs significantly from the CM-CSA, which uses different MOS devices for current mirroring or I-V conveying and is therefore more vulnerable to V_TH mismatch. In addition, the CSB-SA uses the sampled current to generate a fast 1st-stage voltage difference at its BL-decoupled, small-load internal nodes, unlike VSA or RD-CSA, which have to develop their 1st-stage voltage on the heavily loaded BL using continuous I_CELL driving. IOC alleviates SA V_TH mismatch, but requires a complex multi-step V_TH-nulling process and numerous switching devices. IOC also does not cancel the SA offset due to transistor width/length or T_OX variations, and is still vulnerable to BL noise/C_BL mismatch. In our CSB-SA, the sampled currents a...