Motivated by a greedy approach for generating information stable processes, we prove a universal maximum likelihood (ML) upper bound on the capacities of discrete information stable channels, including the binary erasure channel (BEC), the binary symmetric channel (BSC), and the binary deletion channel (BDC). The bound is derived from a system of equations obtained via the Karush-Kuhn-Tucker conditions. Intriguingly, for some memoryless channels, e.g., the BEC and BSC, the resulting upper bounds are tight and equal to the channel capacities. For the BDC, the universal upper bound is related to a matching function that counts the number of ways in which a length-m binary subsequence can be obtained by deleting n − m bits from a length-n binary sequence, where n − m is close to nd and d denotes the deletion probability. Obtaining explicit upper bounds from the universal upper bound requires maximizing the matching function over the Hamming cube of all length-n binary sequences. Computing this maximum exactly is hard; instead, we provide a combinatorial formula that approximates it. Under certain assumptions, we derive several approximations and an explicit upper bound for deletion probability d ≥ 1/2.
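As an illustration of the counting function described above, the following is a minimal sketch, not the paper's construction. It counts the ways a length-m sequence y arises from a length-n sequence x by deleting n − m symbols, and then maximizes that count by brute force over all x in the Hamming cube. The names `count_embeddings` and `max_over_cube` are hypothetical, and maximizing over x for a fixed y is an assumption about how the maximization is set up. The exhaustive search costs on the order of 2^n evaluations, so it is only feasible for small n.

```python
from itertools import product


def count_embeddings(x, y):
    """Count the ways to obtain y from x by deleting len(x) - len(y) symbols."""
    m = len(y)
    # ways[j] = number of ways to obtain the prefix y[:j] from the part of x read so far
    ways = [1] + [0] * m
    for symbol in x:
        # Update right to left so each symbol of x is used at most once per embedding.
        for j in range(m, 0, -1):
            if y[j - 1] == symbol:
                ways[j] += ways[j - 1]
    return ways[m]


def max_over_cube(y, n):
    """Brute-force maximum of count_embeddings(x, y) over all x in {0,1}^n.

    Assumption: y is a tuple of 0/1 integers, matching the tuples from product().
    """
    return max(count_embeddings(x, y) for x in product((0, 1), repeat=n))
```

For example, `count_embeddings((0, 1, 0, 1), (0, 1))` counts the three position pairs (1,2), (1,4), and (3,4) at which the pattern 01 occurs, while `max_over_cube((0, 1), 4)` searches all 16 sequences of length 4.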