We consider the problem of finding an optimal packet transmission policy that minimizes the total cost of transmitting M data packets from a source S to two receivers, R1 and R2, over half-duplex erasure channels. In each time slot, the source can either broadcast random linear network coding (RLNC) packets to both receivers or transmit to a single receiver using a unicast session. We assume that the receivers can share their knowledge with each other by sending RLNC packets over unicast transmissions. We model this problem as a Markov Decision Process (MDP) in which each action specifies which node transmits and which type of transmission is used in a given time slot, given perfect knowledge of the system state. We study the distribution of the actions selected by the MDP as a function of the knowledge at the receivers, the channel erasure probabilities, and the ratio between the costs of broadcast and unicast. This allows us to learn from the optimal policy and devise two simple yet powerful heuristics that are useful in practice. Our heuristics rely on different levels of feedback, namely, one or two feedback packets per receiver per M data packets, sent at suitably chosen moments. Our numerical results show that our heuristics achieve the same performance as the MDP solution.
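
As a rough illustration of the MDP formulation, the following Python sketch solves a deliberately simplified version of the problem by backward induction. It is not the model studied in this work: it assumes that the state is only the pair (i, j) of packets still missing at R1 and R2, that every received RLNC packet is innovative, and that only the source transmits, so receiver-to-receiver cooperation and the half-duplex constraint are omitted. The function name, the action labels, and the parameters `e1`, `e2`, `cost_broadcast`, and `cost_unicast` are hypothetical names chosen for this example.

```python
def solve_simplified_mdp(M, e1, e2, cost_broadcast, cost_unicast):
    """Backward induction over states (i, j): packets still missing at R1, R2.

    Returns (V, policy): minimal expected cost-to-go and best action per state.
    Simplifying assumptions (not from the paper): source-only transmissions,
    every received coded packet is innovative, and e1 < 1, e2 < 1.
    """
    V = {(0, 0): 0.0}
    policy = {}
    # Transitions never increase i + j, so solve states in order of i + j.
    for total in range(1, 2 * M + 1):
        for i in range(max(0, total - M), min(M, total) + 1):
            j = total - i
            need1, need2 = i > 0, j > 0
            # action -> (cost, P[R1 gains a packet], P[R2 gains a packet])
            actions = {
                "broadcast": (cost_broadcast, (1 - e1) * need1, (1 - e2) * need2),
                "unicast_R1": (cost_unicast, (1 - e1) * need1, 0.0),
                "unicast_R2": (cost_unicast, 0.0, (1 - e2) * need2),
            }
            best_action, best_value = None, float("inf")
            for name, (cost, p1, p2) in actions.items():
                p_stay = (1 - p1) * (1 - p2)  # nobody gains a packet
                if p_stay >= 1.0:
                    continue  # action cannot make progress in this state
                outcomes = [
                    (p1 * p2, (i - 1, j - 1)),
                    (p1 * (1 - p2), (i - 1, j)),
                    ((1 - p1) * p2, (i, j - 1)),
                ]
                future = sum(p * V[s] for p, s in outcomes if p > 0)
                # Self-loop eliminated: V = cost + future + p_stay * V.
                value = (cost + future) / (1 - p_stay)
                if value < best_value:
                    best_action, best_value = name, value
            V[(i, j)] = best_value
            policy[(i, j)] = best_action
    return V, policy


if __name__ == "__main__":
    V, policy = solve_simplified_mdp(M=3, e1=0.2, e2=0.4,
                                     cost_broadcast=1.5, cost_unicast=1.0)
    print(V[(3, 3)], policy[(3, 3)])
```

Tabulating `policy` over all states for different erasure probabilities and broadcast-to-unicast cost ratios gives a small-scale analogue of the action distributions described above, under the stated simplifications.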