The Internet of Things (IoT) requires a new processing paradigm that inherits the scalability of the cloud while minimizing network latency by using resources closer to the network edge. On the one hand, building such flexibility into the edge-to-cloud continuum, a distributed networked ecosystem of heterogeneous computing resources, is challenging. On the other hand, IoT traffic dynamics and the rising demand for low-latency services foster the need for minimizing response time and balancing service placement. Load-balancing for fog computing thus becomes a cornerstone of cost-effective system management and operations. This paper formulates a decentralized load-balancing problem for IoT service placement with two optimization objectives: (global) IoT workload balance and (local) quality of service (QoS), the latter in terms of minimizing the cost of deadline violation, service deployment, and unhosted services. The proposed solution, EPOS Fog, introduces a decentralized multi-agent system for collective learning that utilizes edge-to-cloud nodes to jointly balance the input workload across the network and minimize the costs involved in service execution. The agents locally generate possible assignments of requests to resources and then cooperatively select one assignment each, such that their combination maximizes edge utilization while minimizing service execution cost. Extensive experimental evaluation with realistic Google cluster workloads on various networks demonstrates the superior performance of EPOS Fog in terms of workload balance and QoS, compared to approaches such as First Fit and exclusively cloud-based placement. The results confirm that EPOS Fog reduces service execution delay by up to 25% and improves the load balance of network nodes by up to 90%.
The findings also demonstrate how distributed computational resources on the edge can be utilized more cost-effectively by harvesting collective intelligence.

INDEX TERMS Agent, cloud computing, collective learning, distributed optimization, edge computing, fog computing, internet of things (IoT), load-balancing, service placement.
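The two-step idea described in the abstract (agents locally generate candidate assignments, then cooperatively select one each) can be illustrated with a minimal sketch. This is not the EPOS Fog algorithm itself: the plan generator, the variance-based global cost, the weight `lam`, and the simple greedy sweep are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import random
from statistics import pvariance


def generate_plans(num_nodes, demand, num_plans, rng):
    """Hypothetical plan generator: each plan puts the agent's demand on one node."""
    plans = []
    for _ in range(num_plans):
        load = [0.0] * num_nodes
        load[rng.randrange(num_nodes)] = demand
        local_cost = rng.random()  # stand-in for deadline/deployment cost (assumption)
        plans.append((load, local_cost))
    return plans


def objective(all_plans, choice, lam):
    """Weighted sum of global cost (load variance) and mean local cost."""
    selected = [all_plans[agent][c] for agent, c in enumerate(choice)]
    # Aggregate load per node across all agents' selected plans.
    total = [sum(column) for column in zip(*(load for load, _ in selected))]
    global_cost = pvariance(total)  # lower variance = better balanced
    local_cost = sum(cost for _, cost in selected) / len(selected)
    return (1 - lam) * global_cost + lam * local_cost


def select_plans(all_plans, lam=0.5, sweeps=5):
    """Greedy sweeps: each agent in turn picks its best plan given the others' choices."""
    choice = [0] * len(all_plans)
    for _ in range(sweeps):
        for agent, plans in enumerate(all_plans):
            choice[agent] = min(
                range(len(plans)),
                key=lambda c: objective(
                    all_plans, choice[:agent] + [c] + choice[agent + 1:], lam
                ),
            )
    return choice, objective(all_plans, choice, lam)


if __name__ == "__main__":
    rng = random.Random(0)
    all_plans = [generate_plans(4, 1.0, 5, rng) for _ in range(8)]
    choice, cost = select_plans(all_plans, lam=0.3)
    print("selected plan per agent:", choice)
    print("combined cost:", cost)
```

In this sketch, `lam` trades off the local (QoS-like) cost against the global (balance) cost; because each agent's current plan is always among its candidates, the combined objective never increases from one step to the next.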