In 1960, Rudolf E. Kalman created what is known as the Kalman filter, which is a way to estimate unknown variables from noisy measurements. The algorithm follows the logic that if the previous state of the system is known, it could be used as the best guess for the current state. This information is first applied a priori to any measurement by using it in the underlying dynamics of the system. Second, measurements of the unknown variables are taken. These two pieces of information are taken into account to determine the current state of the system. Bayesian inference is specifically designed to accommodate the problem of updating what we think of the world based on partial or uncertain information. In this paper, we present a derivation of the general Bayesian filter, then adapt it for Markov systems. A simple example is shown for pedagogical purposes. We also show that by using the Kalman assumptions or "constraints", we can arrive at the Kalman filter using the method of maximum (relative) entropy (MrE), which goes beyond Bayesian methods. Finally, we derive a generalized, nonlinear filter using MrE, where the original Kalman Filter is a special case. We further show that the variable relationship can be any function, and thus, approximations, such as the extended Kalman filter, the unscented Kalman filter and other Kalman variants are special cases as well.