In the future, with the advent of the Internet of Things (IoT), wireless sensors, and multiple 5G applications yet to be developed, a single indoor room might be filled with thousands of devices. These devices will have differing Quality of Service (QoS) demands and resource constraints, such as mobility, hardware, and efficiency requirements. The THz band offers a massive greenfield spectrum and is envisioned to cater to such dense indoor deployments. However, THz communication has multiple caveats, including high molecular absorption, limited coverage range, low transmit power, sensitivity to mobility, and frequent outages, which make it challenging to deploy. THz may compel networks to depend on additional infrastructure, which might not be profitable for network operators and can even result in inefficient resource utilization for devices demanding low to moderate data rates. Using distributed Device-to-Device (D2D) communication in the THz band, we can serve these ultra-dense, low-data-rate applications under such resource constraints. We propose a 2-layered distributed D2D model in which devices use coordinated multi-agent reinforcement learning (MARL) to maximize efficiency and user coverage in dense indoor deployments. We explore the choice of features required to train the algorithms and how that choice impacts system efficiency. We show that densification and mobility in a network can be exploited to extend the limited coverage range of THz devices, without the need for extra infrastructure or resources.