In this paper, a novel deep Q-learning scheme combined with an extended Kalman filter (EKF) is proposed to solve the channel and power allocation problem for a device-to-device-enabled cellular network when prior traffic information is not known to the base station. Furthermore, this work derives an optimal policy for resource and power allocation among users with the aim of maximizing the sum-rate of the overall system. The proposed scheme comprises four phases carried out jointly: cell splitting with clustering, queuing, channel allocation, and power allocation. Cell splitting with the novel K-means++ clustering technique increases network coverage, reduces co-channel interference, and minimizes the transmission power of nodes, whereas the M/M/C:N queuing model addresses the waiting time of users under priority-based data transmission. The difficulty in the Q-learning and deep Q-learning setting lies in achieving an optimal policy, owing to the uncertainty of various system parameters, especially when the state space is extremely large. To improve the robustness of the learner, the EKF is combined with the deep Q-network, incorporating the weight uncertainty of the Q-function as well as the state uncertainty during transitions. Furthermore, the EKF yields an improved loss function that helps the learner reach an optimal policy. Numerical simulations verify the advantage of the proposed resource-sharing scheme over existing schemes.
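The cell-splitting phase mentioned above relies on K-means++ seeding, which spreads initial centroids apart in proportion to squared distance from already-chosen centres. A minimal sketch of that seeding and clustering step is shown below; the user positions, cell size, and number of sub-cells are illustrative assumptions, not values from the paper.

```python
import numpy as np

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: first centre uniform at random, each subsequent
    centre drawn with probability proportional to its squared distance
    to the nearest centre chosen so far."""
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        d2 = np.min(((points[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1), axis=1)
        centers.append(points[rng.choice(len(points), p=d2 / d2.sum())])
    return np.array(centers)

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations on top of K-means++ seeding."""
    rng = np.random.default_rng(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iters):
        # Assign each user to its nearest cluster head.
        labels = np.argmin(((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        # Move each cluster head to the mean of its assigned users.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels, centers

# Hypothetical scenario: 60 users placed uniformly in a 1 km x 1 km cell,
# split into 4 sub-cells (both numbers are assumptions for illustration).
rng = np.random.default_rng(0)
user_positions = rng.uniform(0.0, 1000.0, size=(60, 2))
labels, centers = kmeans(user_positions, k=4)
for c in range(4):
    print(f"sub-cell {c}: {np.sum(labels == c)} users, head at {centers[c].round(1)}")
```

Because the seeding keeps cluster heads well separated, co-channel sub-cells end up spatially apart and each user transmits over a short intra-cluster distance, which is the coverage and power benefit the abstract attributes to this phase.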