This work introduces a new maximum entropy language model that decomposes the model parameters into a low-rank component that learns regularities in the training data and a sparse component that learns exceptions (e.g., multiword expressions). The low-rank component corresponds to a continuous-space language model. This model generalizes the standard ℓ1-regularized maximum entropy model and has an efficient accelerated first-order training algorithm. In conversational speech language modeling experiments, we observe perplexity reductions of 2-5%.
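For concreteness, a minimal sketch of one plausible form of the training objective implied by this decomposition is given below; the feature map f(h), the one-hot word indicator e_w, and the regularization weights γ_* and γ_1 are illustrative assumptions, not necessarily the paper's exact formulation.

% Sketch (assumed form): parameters Theta = L + S, with the nuclear norm
% \|L\|_* encouraging L to be low rank and the l1 norm \|S\|_1 encouraging
% S to be sparse. f(h) is an assumed feature vector for history h, e_w a
% one-hot indicator of word w, and gamma_*, gamma_1 illustrative weights.
\[
  \min_{L,\,S}\;
  -\sum_{i}\log p_{L+S}(w_i \mid h_i)
  \;+\; \gamma_{*}\,\|L\|_{*}
  \;+\; \gamma_{1}\,\|S\|_{1},
  \qquad
  p_{\Theta}(w \mid h)
  = \frac{\exp\!\big(f(h)^{\top}\Theta\,e_{w}\big)}
         {\sum_{w'}\exp\!\big(f(h)^{\top}\Theta\,e_{w'}\big)}.
\]

Under this reading, fixing L = 0 reduces the objective to standard ℓ1-regularized maximum entropy training, consistent with the generalization claim above, while the low-rank factor L is what gives the model its continuous-space interpretation.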