A 256 × 256 single-photon avalanche diode (SPAD) sensor integrated in a 3D-stacked 90 nm 1P4M/40 nm 1P8M process is reported for flash light detection and ranging (LiDAR) or high-speed direct time-of-flight (ToF) 3D imaging. The sensor bottom tier is composed of a 64 × 64 matrix of modular photon processing units at 36.72 μm pitch, each of which operates from a shared group of 4 × 4 SPADs at 9.18 μm pitch and 51% fill factor. A 16 × 14-bit counter array integrates photon counts or events to compress data to 31.4 Mbps at 30 fps readout over 8 I/O operating at 100 MHz. The pixel-parallel multi-event TDC approach employs a programmable internal or external clock for 0.56 ns to 560 ns time bin resolution. In conjunction with a per-pixel correlator, the power is reduced to less than 100 mW in practical daylight ranging scenarios. Examples of ranging and high-speed 3D ToF applications are given.

Index Terms: 3-D imaging, CMOS, direct time of flight (dTOF), histogramming, image sensor, light detection and ranging (LiDAR), single-photon avalanche diodes (SPADs), time-to-digital converter (TDC), TDC sharing architecture, TOF.
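
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each photon processing unit holds one 16 × 14-bit counter array, that one time bin maps to a round-trip distance of c·t/2, and that each I/O line carries one bit per clock cycle. None of these assumptions is stated explicitly in the text.

```python
# Back-of-the-envelope checks on the sensor figures quoted in the abstract.
# All function names are hypothetical; assumptions are noted inline.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def array_side(units_per_side=64, spads_per_side=4):
    """SPADs along one side of the array: 64 units x 4 SPADs per unit side."""
    return units_per_side * spads_per_side


def spad_pitch_um(unit_pitch_um=36.72, spads_per_side=4):
    """SPAD pitch implied by the processing-unit pitch and 4 x 4 sharing."""
    return unit_pitch_um / spads_per_side


def range_per_bin_m(bin_width_s):
    """Distance spanned by one time bin, assuming round-trip ToF (c * t / 2)."""
    return SPEED_OF_LIGHT_M_PER_S * bin_width_s / 2.0


def readout_rate_bps(units=64 * 64, counters_per_unit=16,
                     bits_per_counter=14, fps=30):
    """Raw counter payload per second.

    Assumes one 16-counter array per processing unit and no framing overhead.
    """
    return units * counters_per_unit * bits_per_counter * fps


def link_capacity_bps(io_lines=8, clock_hz=100_000_000):
    """Peak link capacity, assuming one bit per I/O line per clock cycle."""
    return io_lines * clock_hz


if __name__ == "__main__":
    print(array_side())                       # SPADs per side
    print(spad_pitch_um())                    # SPAD pitch in micrometres
    print(range_per_bin_m(0.56e-9))           # metres per bin, finest setting
    print(range_per_bin_m(560e-9))            # metres per bin, coarsest setting
    print(readout_rate_bps())                 # 14-bit payload, bits per second
    print(readout_rate_bps(bits_per_counter=16))  # if padded to 16-bit words
    print(link_capacity_bps())                # peak bits per second over 8 I/O
```

Under these assumptions the 0.56 ns bin corresponds to roughly 8.4 cm of range and the 560 ns bin to roughly 84 m. The raw 14-bit payload comes to about 27.5 Mbps; the quoted 31.4 Mbps would be consistent with each counter being read out as a 16-bit word, although that is a guess rather than something the abstract confirms. Either figure is far below the 800 Mbps peak capacity of 8 I/O at 100 MHz.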