Edge computing is attracting growing attention and is expected to satisfy the ultra-low response times required by emerging IoT applications. Nevertheless, because emerging delay-sensitive traffic still suffers from latency problems, a new edge computing architecture, Home Edge Computing (HEC), has been proposed to support these real-time applications. HEC is a three-layer architecture made up of HEC servers, which are very close to users, Multi-access Edge Computing (MEC) servers, and the central cloud. This paper proposes a solution to the latency problems that arise on HEC servers because of their limited resources. An increase in the traffic rate creates long queues on these servers, i.e., an increase in the processing time (delay) of requests. Building on clustering and load balancing techniques, we propose a new technique called HEC-Clustering Balance. It distributes requests hierarchically across the HEC clusters and the other layers of the architecture, avoiding congestion on any single HEC server and thereby reducing latency. The results show that HEC-Clustering Balance is more efficient than baseline clustering and load balancing techniques. Compared to the HEC architecture, we reduce the processing time on the HEC servers to 19% and 73%, respectively, in two experimental scenarios.
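The idea of distributing requests hierarchically to avoid congestion on a single HEC server can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes that each server has a fixed queue capacity, that a request goes to the least-loaded HEC server in the user's cluster, and that overflow moves up to an MEC server and finally to the cloud. The names `Server`, `least_loaded`, and `dispatch`, as well as the capacity values, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    name: str
    capacity: int  # assumed maximum queue length (hypothetical parameter)
    queue: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.queue) < self.capacity


def least_loaded(servers: List[Server]) -> Optional[Server]:
    """Return the server with the shortest queue that still has room, or None."""
    candidates = [s for s in servers if s.has_room()]
    return min(candidates, key=lambda s: len(s.queue)) if candidates else None


def dispatch(request: str, hec_cluster: List[Server],
             mec_servers: List[Server], cloud: Server) -> str:
    """Queue the request on the lowest tier with room; return the server name.

    Assumption: the cloud is treated as always available, so it is used
    as the last resort without a capacity check.
    """
    target = least_loaded(hec_cluster) or least_loaded(mec_servers) or cloud
    target.queue.append(request)
    return target.name
```

In this sketch, a call such as `dispatch("r0", hec_cluster, mec_servers, cloud)` keeps requests on the HEC layer while any server in the cluster has room and escalates only when the whole cluster is saturated, which is one simple way to bound queue length per HEC server.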