This article proposes a state-dependent routing algorithm based on a global optimization cost function whose parameters are learned from the real-time state of the network, with no a priori model. The proposed approach samples and estimates pertinent aspects of the network environment, such as traffic type, QoS policies, and resources, and builds a model of them. It is based on the trial-and-error paradigm combined with swarm-adaptive approaches. The overall system combines stochastic planned pre-navigation for the exploration phase with a deterministic approach for the backward phase. We analyzed the performance of the proposed algorithm in OPNET on several topologies, including the Nippon Telegraph and Telephone (NTT) network. The simulation results demonstrate substantial performance improvements over traditional routing approaches and show the benefits of learning approaches for networks with dynamically changing traffic.