We present partial likelihood (PL) as an effective means of developing nonlinear techniques for signal processing. Posing signal processing problems in a likelihood setting provides a number of advantages, such as access to the powerful tools of statistics and easy incorporation of model order/complexity selection through appropriate information-theoretic criteria. However, likelihood formulations in most time series applications require a mechanism for discounting the dependence structure of the data. We explain how PL bypasses this requirement and note that it can coincide with conditional likelihood in a number of cases. We show that PL theory can also be used to establish the fundamental information-theoretic connection, namely the equivalence of likelihood maximization and relative entropy minimization, without assuming independent observations, an assumption that is unrealistic for most signal processing applications. We show that this equivalence holds for the basic class of probability models, the exponential family, which includes many important structures that can be used as nonlinear filters. We conclude with examples of the application of PL theory.
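For concreteness, here is a minimal sketch of the PL setup; the notation $y_n$, $\mathcal{F}_{n-1}$, and $\theta$ is introduced here for illustration and is not taken from the text above. For observations $y_1,\dots,y_N$ of a time series, with $\mathcal{F}_{n-1}$ denoting the history generated by the past samples and $p_\theta(y_n \mid \mathcal{F}_{n-1})$ a conditional model, the partial likelihood and its maximizer are
\[
  L_P(\theta) \;=\; \prod_{n=1}^{N} p_\theta\!\left(y_n \mid \mathcal{F}_{n-1}\right),
  \qquad
  \hat{\theta}_{PL} \;=\; \arg\max_{\theta}\; \frac{1}{N}\sum_{n=1}^{N} \log p_\theta\!\left(y_n \mid \mathcal{F}_{n-1}\right).
\]
No independence assumption is needed here, since each factor conditions on the observed history. Under standard regularity conditions, maximizing the normalized log PL corresponds asymptotically to minimizing the relative entropy (Kullback-Leibler divergence) between the true conditional density and the model,
\[
  \hat{\theta}_{PL} \;\approx\; \arg\min_{\theta}\;
  \mathrm{E}\!\left[\, D\!\left( p(y_n \mid \mathcal{F}_{n-1}) \,\Vert\, p_\theta(y_n \mid \mathcal{F}_{n-1}) \right) \right],
\]
which is one way to read the information-theoretic equivalence referred to above.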