In continual and lifelong learning, good representation learning can help increase performance and reduce sample complexity when learning new tasks. There is evidence that representations do not suffer from "catastrophic forgetting" even in plain continual learning, but little further fact is known about its characteristics. In this paper, we aim to gain more understanding about representation learning in continual learning, especially on the feature forgetting problem. We devise a protocol for evaluating representation in continual learning, and then use it to present an overview of the basic trends of continual representation learning, showing its consistent deficiency and potential issues. To study the feature forgetting problem, we create a synthetic dataset to identify and visualize the prevalence of feature forgetting in neural networks. Finally, we propose a simple technique using gating adapters to mitigate feature forgetting. We conclude by discussing that improving representation learning benefits both old and new tasks in continual learning.Preprint. Under review.