Recently, Renes proposed a quantum algorithm called belief propagation with quantum messages (BPQM) for decoding classical data encoded using a binary linear code with a tree Tanner graph and transmitted over a pure-state CQ channel [1]. The algorithm is a genuine quantum counterpart to decoding based on the classical belief propagation algorithm, which has found wide success in classical coding theory when used in conjunction with LDPC or turbo codes. More recently, Rengaswamy et al. [2] numerically observed that BPQM implements the optimal decoder on a small example code, in that it implements the optimal measurement for distinguishing the quantum output states for the set of input codewords. Here we significantly expand the understanding, formalism, and applicability of the BPQM algorithm with the following contributions. First, we prove analytically that BPQM realizes optimal decoding for any binary linear code with a tree Tanner graph. We also provide the first formal description of the BPQM algorithm in full detail and without any ambiguity. In so doing, we identify a key flaw, overlooked in the original algorithm and subsequent works, which implies that quantum circuit realizations will be exponentially large in the code size: although BPQM passes quantum messages, other information required by the algorithm is processed globally. We remedy this problem by formulating a truly message-passing algorithm which approximates BPQM and has circuit complexity O(poly n, polylog 1/ε), where n is the code length and ε is the approximation error. Finally, we propose a novel method for extending BPQM to factor graphs containing cycles by making use of approximate cloning. We present promising numerical results indicating that BPQM on factor graphs with cycles can significantly outperform the best possible classical decoder.