Deriving reduced-order representations of dynamical systems requires modeling the effect of the truncated dynamics on the retained dynamics. In its most general form, this so-called closure model has to account for memory effects. In this work, we present an operator inference framework that extracts the governing dynamics of the closure from data in a compact, non-Markovian form. We employ sparse polynomial regression and artificial neural networks to extract the underlying operator. For a special class of nonlinear systems, we analyze the observability of the closure in terms of the resolved dynamics and present theoretical results on the compactness of the memory. The proposed framework is evaluated on examples ranging from linear to nonlinear systems, with and without chaotic dynamics, with an emphasis on predictive performance on unseen data.
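
As a purely illustrative sketch, and not the framework evaluated in this work, the idea of inferring closure dynamics by sparse polynomial regression can be shown on a hypothetical two-variable linear system. Everything below is an assumption made for illustration: the toy system, the degree-2 polynomial library, the names `simulate`, `library`, and `stlsq`, and the use of sequentially thresholded least squares as the sparse regressor.

```python
import numpy as np

# Hypothetical toy system (an assumption for illustration only):
#   dx/dt = -x + y    (x is the resolved variable)
#   dy/dt =  x - 2y   (y is the truncated variable)
# The closure term in the resolved equation is c = y, whose exact
# dynamics are dc/dt = x - 2c.
A = np.array([[-1.0, 1.0], [1.0, -2.0]])


def simulate(z0, dt=1e-3, steps=2000):
    """Integrate dz/dt = A z with classical RK4; returns shape (steps, 2)."""
    z = np.empty((steps, 2))
    z[0] = z0
    for k in range(steps - 1):
        k1 = A @ z[k]
        k2 = A @ (z[k] + 0.5 * dt * k1)
        k3 = A @ (z[k] + 0.5 * dt * k2)
        k4 = A @ (z[k] + dt * k3)
        z[k + 1] = z[k] + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return z


def library(c, x):
    """Candidate terms up to degree 2: [1, c, x, c^2, c*x, x^2]."""
    return np.column_stack([np.ones_like(c), c, x, c**2, c * x, x**2])


def stlsq(theta, target, threshold=0.1, iters=10):
    """Sequentially thresholded least squares (one common sparse regressor)."""
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit only the surviving terms.
        xi[big] = np.linalg.lstsq(theta[:, big], target, rcond=None)[0]
    return xi


dt = 1e-3
# Several initial conditions on the unit circle give well-conditioned data.
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
rows, rhs = [], []
for a in angles:
    z = simulate([np.cos(a), np.sin(a)], dt=dt)
    x, c = z[:, 0], z[:, 1]  # closure read directly from the full state here
    rows.append(library(c, x))
    rhs.append(np.gradient(c, dt, edge_order=2))  # finite-difference dc/dt

theta = np.vstack(rows)
dcdt = np.concatenate(rhs)
xi = stlsq(theta, dcdt)
print(dict(zip(["1", "c", "x", "c^2", "c*x", "x^2"], np.round(xi, 4))))
```

For this toy system the regression should retain only the `c` and `x` terms, with coefficients close to -2 and 1. Eliminating the closure variable from dc/dt = x - 2c gives c(t) = e^{-2t} c(0) + \int_0^t e^{-2(t-s)} x(s) ds, so the learned ordinary differential equation is a compact stand-in for an exponentially decaying memory kernel acting on the resolved variable.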