Many emerging applications, such as augmented reality, facial recognition, autonomous cars, and e-health, require heavy computation, and the processed results must be available to the user within milliseconds. Edge computing combined with cloud computing can address this challenge by distributing the load (offloading) across different connected computing resources. However, effective task offloading requires an efficient resource management framework. Many existing offloading methodologies consider only latency and energy consumption in a pre-defined, small-scale network configuration. In addition, the effect of where the offloading algorithm is deployed has not been extensively studied. This thesis introduces a novel adaptive offloading framework based on online deep Q-learning, a reinforcement learning technique. The proposed framework accounts for strict latency constraints, a large state space, rapidly changing user mobility, heterogeneous resources, and stochastic task arrival rates. The research also highlights the importance of caching and introduces a novel concept called "container caching", which caches the dependencies of popular applications. Offloading decisions are therefore made to minimize energy consumption, latency, and caching costs. Furthermore, the significance of the deployment location of the offloading algorithm is examined, and a distributed offloading method is proposed. Extensive simulations have been conducted in a discrete-event simulator implemented in Java, using realistic task profiles. Simulation results and comparisons with existing benchmark algorithms show strong performance in terms of energy consumption, network traffic, task failures, and remaining power at large scale, demonstrating the feasibility of the proposed approach.