We present a general model for quantum channels with memory, and show that it encompasses all causal automata: any quantum process in which outputs up to some time t do not depend on inputs at times t′ > t can be decomposed into a concatenated memory channel. We then examine different physical setups in which channels with memory may be operated for the transfer of (private) classical and quantum information. These include setups in which either the receiver or a malicious third party has control of the initializing memory. We introduce classical and quantum channel capacities for these settings, and give several examples to show that they may or may not coincide. Entropic upper bounds on the various channel capacities are given. For forgetful quantum channels, in which the effect of the initializing memory dies out as time increases, coding theorems are presented to show that these bounds may be saturated. Forgetful quantum channels are shown to be open and dense in the set of quantum memory channels.