Modern processors often suffer from inefficient resource utilization, which leads to inferior performance and energy efficiency. This dissertation scrutinizes the utilization of datapath and cache resources in superscalar processors for opportunities to improve performance and energy efficiency.

Traditional superscalar processors usually employ a one-size-fits-all design approach that allocates a fixed amount of resources to all applications at all times to deliver the best overall performance. However, the one-size-fits-all approach is not always energy efficient, because both the application behavior and the use scenario change constantly, and the demand for processor resources changes accordingly.

To improve the utilization of datapath resources, this dissertation proposes an adaptive processor that dynamically allocates datapath resources based on the needs of applications and use scenarios. The adaptive processor is applied to two use cases to improve energy efficiency. In the first use case, front-end throttling (FET), the adaptive processor dynamically throttles the front-end instruction delivery bandwidth as program behavior changes to optimize a target metric, be it performance, energy, or an arbitrary trade-off between the two. In the second use case, dynamic core scaling (DCS), the adaptive processor extends the performance-energy trade-off capabilities of superscalar processors by scaling datapath resources rather than voltage. The adaptive processor ensures that programs run at a given percentage of their maximum speed and, at the same time, minimizes energy consumption by dynamically adjusting the active superscalar datapath resources. DCS is more effective in performance-energy trade-offs than dynamic voltage and frequency scaling (DVFS) at the high-performance end. When used together with DVFS, DCS significantly extends the range of performance-energy trade-offs.

Caches also suffer from inefficient utilization in modern processors. To minimize the access latency of set-associative caches, the data in all ways are read out in parallel with the tag lookup. However, this is energy inefficient, as only the data from the matching way is used and the rest are discarded. To improve the utilization of the L1 instruction cache, this dissertation proposes an early tag lookup (ETL) technique for L1 instruction caches that determines the matching way one cycle before the cache access, so that only the matching data way needs to be accessed. ETL incurs no performance penalty and insignificant hardware overhead, yet dramatically reduces the read energy of the L1 instruction cache.

For memory-intensive workloads, caches often suffer from thrashing, i.e., high-reuse blocks evicting one another from the cache due to the lack of space. To reduce thrashing, only a fraction of the working set should be kept in the cache, so that at least that fraction stays in the cache long enough to be reused before eviction. However, prior insertion policies take an ad hoc approach to selecting that fraction, e.g., inserting blocks with high priority at ...