Abstract: Visible light communications (VLC) can be adopted in lighting infrastructure to provide ubiquitous wireless access in spaces lit for human use. Unfortunately, the requirement to provide high-quality diffuse illumination reduces the potential capacity of the free-space links, so we seek ways to accommodate both the lighting and data rate goals. We investigate combining multiple VLC transmissions through spatial multiplexing (SM), a MIMO technique, to meet our data rate goals. Specifically, this paper deals with receiver designs intended to receive and decode increasing numbers of SM/MIMO VLC data streams. Conventional imaging (camera) sensors have been used as VLC receivers; however, they are intended to capture frames at relatively low speeds, and their architectures do not translate well to receiving multiple high-rate VLC streams. Thus, we consider new techniques to optimize imaging receivers to meet the capacity requirements of multiple SM streams. In this paper, we propose token-based pixel selection (TBPS) for CMOS image sensors as a scalable alternative to windowing that mitigates the decrease in sampling rate incurred as the number of received streams grows. We show that in many cases, TBPS-based image sensors sample transmissions several times more frequently than windowing image sensors, yielding higher VLC data rates. Assuming the same reset, integration, and read times, TBPS always performs as well as, and often better than, windowing.