Nowadays, high-performance computing (HPC) architectures are designed to solve assorted sophisticated scientific and engineering problems across an ever-growing number of HPC and professional workloads. The key trigonometric functions sine and cosine are applied and computed in all spheres of daily life, yet computing them is a fairly time-consuming task in high-performance numerical simulations. In this paper, we discuss in detail how the micro-architecture of single-core Itanium® and Alpha 21264/21364 processors, together with manual optimization techniques, improves the computing performance of several mathematical functions. By describing the detailed algorithm and its execution pattern on the processor, we confirm that the processor micro-architecture combined with manual optimization techniques improves computing performance significantly compared with not only the standard math library's built-in functions compiled with compiler optimization options but also the highly optimized mathematical functions of the Intel® Itanium® library.

J CIRCUIT SYST COMP 2014.23. Downloaded from www.worldscientific.com by UNIVERSITY OF GUELPH on 03/31/15. For personal use only.

... wave simulations [11-14], simulation of building oscillations [15,16], dynamic coefficient calculation of a bridge [17], earthquake simulation [18], ballistic trajectories [19], flight simulation [20,21], and nuclear physics [22]. Nonetheless, sin(x) and cos(x) are somewhat computationally intensive functions. For instance, Strebel [23] showed that sin(x) and cos(x) from the standard math library routines take 260 ns to compute using the cc compiler with its -fast optimization option on a 500 MHz Alpha 21164 processor, whereas his optimized implementation took only 37 ns on the same processor with gcc and its -O1 option. However, these functions can be computed accurately and efficiently by calling math library routines, which also handle exceptional cases, e.g., the input argument x = 0.
Yet standard math library functions are often incapable of delivering the performance that HPC numerical simulations demand. Consequently, numerous methods based on either hardware [24-47] or software [23,48-50] have been proposed to reach the desired level of performance for these key functions.

The IA-64 (Intel® Architecture-64) is a unique combination of innovative features, e.g., explicit parallelism, predication, and speculation. It is designed to be highly scalable to fill the increasing performance requirements of various server and workstation market segments. Itanium is a family of 64-bit Intel® processors that implement the Intel® Itanium® architecture (formerly called IA-64). The Alpha 21264 (EV-264) microprocessor is a high-performance third-generation implementation of the Compaq Alpha architecture [51]. The Alpha 21364 (EV-364) microprocessor is the fourth generation of the Alpha microprocessor family. EV-364 is in many ways similar to ... A key feature of EV-264/EV-364 is that the floating-point add pipeline and the floating-point multiply pipeline are...