In this paper, we analyse the performance and energy consumption of five OpenMP runtime systems on a NUMA platform. We also select three CPU-level optimizations, or techniques, and evaluate their impact on the runtime systems: the processor features Turbo Boost and C-States, and CPU DVFS through Linux CPUFreq governors. We present an experimental study that characterizes OpenMP runtime systems on the three main kernels of dense linear algebra (Cholesky, LU and QR) in terms of performance and energy consumption. Our experimental results suggest that OpenMP runtime systems can be considered a new energy leverage, and that Turbo Boost, as well as C-States, significantly impacted performance and energy. CPUFreq governors had more impact with Turbo Boost disabled, since enabling both optimizations together reduced performance due to CPU thermal limits. An LU factorization with the concurrent write extension from libKOMP achieved up to 63% performance gain and 29% energy decrease.