Network slicing and mixed-numerology access schemes play a central role in enabling the flexible multi-service connectivity that characterizes 5G radio access networks (RAN). However, the interference generated by simultaneously multiplexing radio slices with heterogeneous subcarrier spacings can undermine both the isolation of the slices sharing the RAN and their ability to meet application requirements. To address these issues, we design a radio resource allocation scheme that accounts for inter-numerology interference and maximizes the aggregate network throughput. To cope with the computational complexity of the optimal formulation, we leverage deep reinforcement learning (DRL) to design an agent that approximates the optimal solution using a model-free environment formulation. We propose a multi-branch agent architecture, based on Branching Dueling Q-networks (BDQ), which keeps the agent scalable as the number of spectrum resources and network slices increases. In addition, we improve the agent's learning performance with an action mapping procedure designed to enforce the selection of feasible actions. We compare the agent's performance against several benchmark schemes. Results show that the proposed solution provides a good approximation of the optimal allocation in most scenarios.
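
As a rough illustration of why a branching dueling architecture scales, the sketch below shows the standard BDQ aggregation rule, in which a shared state value V(s) is combined with mean-centred per-branch advantages: Q_d(s, a) = V(s) + A_d(s, a) - mean over a' of A_d(s, a'). It is a minimal sketch, not the agent proposed here: the function names `bdq_q_values` and `greedy_joint_action`, the numeric values, and the reading of one branch per spectrum resource with one sub-action per slice are all assumptions made for illustration only.

```python
from typing import List


def bdq_q_values(state_value: float, advantages: List[List[float]]) -> List[List[float]]:
    """Combine a shared state value with per-branch advantages.

    For each branch d: Q_d(s, a) = V(s) + A_d(s, a) - mean_a' A_d(s, a').
    """
    q_values = []
    for branch_adv in advantages:
        mean_adv = sum(branch_adv) / len(branch_adv)
        q_values.append([state_value + a - mean_adv for a in branch_adv])
    return q_values


def greedy_joint_action(q_values: List[List[float]]) -> List[int]:
    """Pick the best sub-action independently in every branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_values]


# Hypothetical numbers: 3 branches (assumed: one per spectrum resource),
# each with 2 sub-actions (assumed: which slice receives the resource).
v = 1.0
adv = [[0.5, -0.5], [0.0, 2.0], [1.0, 3.0]]

q = bdq_q_values(v, adv)
action = greedy_joint_action(q)
print(q)
print(action)
```

With D branches of n sub-actions each, such a network emits D·n Q-values instead of the n^D needed for a flat joint action space, which is the scalability property the architecture relies on. A feasibility step such as the action mapping mentioned above would then act on the selected joint action; its details are not specified here and are deliberately left out of the sketch.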