In liberalized power markets, traditional power management concepts have reached their limits. Optimal pricing can no longer be achieved, e.g. for very short-term needs across grids. Power line overload and loss of grid stability, increasingly resulting in regional or even global blackouts, are a growing threat. The highly desirable expansion of renewable energy production amplifies these challenges: we argue that for this emerging technology the traditional top-down, long-term power management is obsolete, because wind and solar-based power facilities are widely dispersed and highly unpredictable. In the DECENT R&D initiative we developed a multi-level, bottom-up solution in which autonomous collaborative software agents negotiate available energy quantities and needs on behalf of consumer and producer groups (the DEZENT algorithm). We operate within very short time intervals during which demand and supply are assumed to be constant, in our case periods of 0.5 s (the switching delay of a light bulb). Within this interval we are also able to manage the coordinated power distribution and achieve grid stability. The solution has proven to be secure against a relevant variety of malicious attacks. The main contribution of this paper is to make the negotiation strategies themselves adaptive across periods: we derive the dynamic distributed learning algorithm DECOLEARN from Reinforcement Learning principles. It provides the agents with collaborative intelligence and proves substantially superior to conventional (static) procedures. We report briefly on our extensive comparative simulation experiments.