Multicast is a key networking service that delivers information from a source to multiple destinations in a single transmission. It underpins collaborative multimedia applications such as video conferencing, distance learning and other forms of content distribution over Multi-Channel Multi-Radio Wireless Mesh Networks (MCMR WMNs). In these networks, however, multicast protocol design is tightly coupled with the nodes' channel-radio associations, which must be chosen to keep interference to a minimum. Most research on WMN multicasting relies on heuristic or meta-heuristic strategies that take a sequential approach, solving channel assignment and multicast routing as two disjoint sub-problems. The resulting network configurations are sub-optimal, because the two sub-problems are inherently interdependent: their cross-interaction follows from the specification of the problem itself. In this paper, we first propose a cross-layer mathematical formulation of joint channel assignment and multicast tree construction in MCMR WMNs which, unlike the existing schemes, guarantees an optimal solution. Simulation results demonstrate that, in terms of inter-channel interference, our cross-layer design outperforms the Level Channel Assignment (LCA) and Multi-Channel Multicast (MCM) algorithms proposed by Zeng et al. (2010) and the Genetic Algorithm (GA), Simulated Annealing (SA) and Tabu Search (TS)-based methods proposed by Cheng et al. (2011). Second, since the joint optimisation model is relatively demanding in terms of complexity, we relax the requirement of joint optimality and explore a layered formulation in which each sub-problem is solved optimally. This alternative design is also shown to outperform the prior art in terms of interference minimisation. We conduct an extensive series of simulations to analyse the optimality and complexity of our two design strategies.
Overall interference serves as our measure of optimality, while complexity is evaluated in terms of memory consumption and the time required to solve the multicast problem.