To thrive in evolving environments, humans continually acquire and transfer new knowledge from a continuous video stream, with minimal supervision, while retaining previously learnt experiences. In contrast, most standard continual learning benchmarks focus on learning from static i.i.d. images in fully supervised settings. Here, we examine a more realistic and challenging problem: Label-Efficient Online Continual Object Detection (LEOCOD) in video streams. Addressing this problem would benefit many real-world applications by reducing data annotation costs and model retraining time. To tackle it, we draw inspiration from complementary learning systems (CLS) in the human brain and propose a computational model, dubbed Efficient-CLS. Mirroring the functional roles of the hippocampus and the neocortex in CLS, Efficient-CLS posits a memory encoding mechanism in which fast and slow learners interact bidirectionally through synaptic weight transfers and pattern replays. We test Efficient-CLS and competitive baselines on two challenging real-world video stream datasets. Like humans, Efficient-CLS learns to detect new object classes incrementally from a continuous temporal stream of non-repeating video with minimal forgetting. Remarkably, with only 25% of video frames annotated, Efficient-CLS still outperforms all compared models, which are trained with annotations on 100% of video frames. The data and source code will be publicly available at https://github.com/showlab/Efficient-CLS.
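As an illustration only, the two ingredients named above (weight transfer between a fast and a slow learner, and pattern replay) can be sketched on toy data. The abstract does not specify either rule, so this sketch assumes an exponential-moving-average transfer from fast to slow weights and a reservoir-sampled replay memory. The names `ema_transfer`, `ReplayBuffer`, and `fast_step` are hypothetical and do not come from the Efficient-CLS code; the sketch covers only the fast-to-slow direction of the interaction.

```python
import random


def ema_transfer(slow, fast, decay=0.99):
    """Pull slow weights toward fast weights (assumed EMA rule, not the paper's)."""
    return [decay * s + (1.0 - decay) * f for s, f in zip(slow, fast)]


class ReplayBuffer:
    """Fixed-capacity memory filled by reservoir sampling (assumed policy)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        # Keep each sample seen so far with equal probability.
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def sample(self, k):
        # Draw up to k stored samples without replacement.
        return self.rng.sample(self.items, min(k, len(self.items)))


def fast_step(fast, target, lr=0.5):
    """Toy stand-in for a gradient step: move fast weights toward a target."""
    return [w + lr * (t - w) for w, t in zip(fast, target)]


fast = [0.0, 0.0]
slow = [0.0, 0.0]
buffer = ReplayBuffer(capacity=4)

# A toy non-repeating "stream": the target shifts halfway through.
stream = [[1.0, 1.0]] * 10 + [[-1.0, 2.0]] * 10

for frame in stream:
    buffer.add(frame)
    # The fast learner trains on the new frame plus replayed samples.
    for target in [frame] + buffer.sample(2):
        fast = fast_step(fast, target)
    # The slow learner consolidates the fast learner's weights.
    slow = ema_transfer(slow, fast, decay=0.9)
```

In this toy setting the slow weights change more gradually than the fast weights after the stream shifts, which is the intuition behind pairing a rapidly adapting learner with a slowly consolidating one.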