Multithreaded processors, by simultaneously exploiting both the thread-level parallelism and the instruction-level parallelism of applications, achieve higher instructions-per-cycle rates than single-thread processors. On a multithreaded workload, a clustered organization maximizes performance. On a single-thread workload, however, all but one of the clusters are idle, degrading single-thread performance significantly. Using a clustered multithreaded processor optimized for multithreaded performance as a baseline, we propose and analyze several mechanisms and policies that improve single-thread execution by exploiting the existing hardware without a significant loss in multithreaded performance. We focus on the fetch unit, which is arguably the most performance-critical stage. Specifically, we analyze three ways of exploiting the idle fetch clusters: allowing a single thread to access its neighbor clusters, using the idle fetch clusters to provide multiple-path execution, or using them to widen the effective single-thread fetch block.