Defensive deception is emerging to reveal stealthy attackers by presenting intentionally falsified information. To implement it in the increasing dynamic and complex cloud, major concerns remain about the establishment of precise adversarial model and the adaptive decoy placement strategy. However, existing studies do not fulfil both issues because of ( 1) the insufficiency on extracting potential threats in virtualisation technique, (2) the inadequate learning on the agility of target environment, and (3) the lack of measurement for placement strategy. In this study, an optimal defensive deception framework is proposed for the container based-cloud. The System Risk Graph (SRG) is formalised to depict an updatable adversarial model with the automatic orchestration platform. Afterwards, a Deep Reinforcement Learning (DRL) model is trained based on SRG. The well-trained DRL agent generates optimal placement strategies for the orchestration platform to distribute decoys and deceptive routings. Lastly, the coefficient of deception, C, is defined to evaluate the effectiveness of placement strategy. Simulation results show that the proposed method increases C by 30.22%, and increase the detection ratio on the random walker attacker and persistent attacker by 30.69% and 51.10%, respectively.