Cloud resource provisioning requires examining tasks, their dependencies and deadlines, and the distribution of available capacity. Existing models that are incomplete or overly complex scale poorly, while comprehensive models that deliver only low-to-moderate QoS are unsuitable for real-time scenarios. To address these challenges, this research proposes a Negotiation Aware SLA Model for Resource Provisioning in cloud deployments. In the proposed model, a task-level SLA maximizes resource allocation fairness and incorporates task dependency for correlated task types. Incoming tasks are grouped by an efficient hierarchical task clustering process, and a priority is assigned to each task. For efficient provisioning, an Elephant Herding Optimization (EHO) model allocates resources to these clusters based on task deadline and make-span levels, using a fitness function that shortens the make-span and raises deadline awareness. A VM-aware negotiation framework then uses Q-Learning for capacity tuning and task shifting, post-processing the allocated tasks so that they execute faster with minimal overhead. Owing to these operations, the proposed model outperforms state-of-the-art models in heterogeneous cloud configurations and across multiple task types: it improves make-span and deadline hit ratio while requiring 9.2% fewer computational cycles, 4.9% less energy, and 5.4% lower computational complexity, making it suitable for large-scale, real-time task scheduling.
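
The abstract does not give the form of the EHO fitness function, so the following is only a minimal sketch of what a make-span and deadline-aware objective could look like. It scores one candidate task-to-VM allocation, which is the quantity a metaheuristic such as EHO would minimize; it is not the EHO search itself. The names `Task`, `evaluate`, `fitness`, the weights `w_span` and `w_deadline`, and the weighted-sum form are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    length: float    # amount of work (hypothetical unit, e.g. million instructions)
    deadline: float  # latest acceptable finish time, measured from scheduling start


def evaluate(assignment, tasks, vm_speeds):
    """Return (make_span, deadline_hit_ratio) for one candidate allocation.

    assignment[i] is the index of the VM that runs tasks[i]. Tasks placed on
    the same VM are assumed to run back to back in list order.
    """
    finish = [0.0] * len(vm_speeds)  # running finish time of each VM
    hits = 0
    for task, vm in zip(tasks, assignment):
        finish[vm] += task.length / vm_speeds[vm]
        if finish[vm] <= task.deadline:
            hits += 1
    return max(finish), hits / len(tasks)


def fitness(assignment, tasks, vm_speeds, w_span=0.5, w_deadline=0.5):
    """Assumed weighted-sum objective; lower values are better."""
    make_span, hit_ratio = evaluate(assignment, tasks, vm_speeds)
    # Normalize by the time the slowest VM would need to run every task
    # serially, which is an upper bound on the make-span of any allocation.
    worst_case = sum(t.length for t in tasks) / min(vm_speeds)
    return w_span * (make_span / worst_case) + w_deadline * (1.0 - hit_ratio)
```

With three tasks `Task(4, 3)`, `Task(8, 6)`, `Task(2, 5)` and VM speeds `[2.0, 1.0]`, the allocation `[0, 0, 1]` meets every deadline with a make-span of 6 time units, whereas placing everything on the slower VM (`[1, 1, 1]`) misses every deadline and receives the worst possible score of 1.0 under this sketch.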