The IBM zEnterprise system introduced a new and innovative redundant array of independent memory (RAIM) subsystem design as a standard feature on all zEnterprise servers. It protects the server from single-channel errors such as sudden control, bus, buffer, and massive dynamic RAM (DRAM) failures, thus achieving the highest System z memory availability. This system also introduced innovations such as DRAM and channel marking, as well as a novel dynamic cyclic redundancy code channel marking. This paper describes this RAIM subsystem and other reliability, availability, and serviceability features, including automatic channel error recovery; data and clock interface lane calibration, recovery, and repair; intermittent lane sparing; and specialty engines for maintenance, periodic calibration, power, and power-on controls.
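The idea of surviving a single-channel failure can be illustrated with a deliberately simplified sketch. It assumes four data channels plus one XOR parity channel and uses a per-channel CRC to decide which channel to mark as bad; the channel count, the function names (`write_line`, `read_line`, `xor_blocks`), and the use of plain XOR parity are all assumptions made for illustration and are not taken from the paper, whose actual error-correcting code is not described in this abstract.

```python
import zlib
from functools import reduce

DATA_CHANNELS = 4  # assumption: four data channels plus one parity channel


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_line(data_blocks):
    """Store each block with its CRC, plus one parity block across channels."""
    assert len(data_blocks) == DATA_CHANNELS
    blocks = list(data_blocks) + [xor_blocks(data_blocks)]
    return [(block, zlib.crc32(block)) for block in blocks]


def read_line(channels):
    """Return the data blocks, rebuilding at most one channel whose CRC fails."""
    bad = [i for i, (block, crc) in enumerate(channels) if zlib.crc32(block) != crc]
    if len(bad) > 1:
        raise ValueError("more than one channel failed; parity cannot recover")
    blocks = [block for block, _ in channels]
    if bad:
        marked = bad[0]  # "mark" the failing channel and ignore its contents
        survivors = [b for i, b in enumerate(blocks) if i != marked]
        blocks[marked] = xor_blocks(survivors)  # XOR of survivors restores it
    return blocks[:DATA_CHANNELS]
```

In this toy model a CRC mismatch identifies the failing channel, and the XOR of the four surviving channels reproduces its contents. A second simultaneous channel failure is reported rather than silently miscorrected.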
The IBM eServer zSeries Model z990 offers customers significant new opportunities for server growth while preserving and enhancing server availability. The z990 provides vertical growth capability by introducing the concurrent addition of processor/memory books, and horizontal growth in channels through the use of extended virtualization technology. To continue to support the zSeries legacy of high availability and continuous reliable operation, the z990 delivers significant new features for reliability, availability, and serviceability (RAS). This paper describes these new capabilities, presenting in each case the value of the feature in terms of both the server's self-management capability and its availability.
RAS design for the IBM eServer z900
The IBM eServer zSeries™ Model 900, or z900, has been designed with major enhancements for hardware reliability, availability, and serviceability (RAS) in support of the zSeries RAS strategy, the eServer self-management technologies, and the z900 design objective of continuous reliable operation. The eServer self-management technologies enable the server to protect itself, to detect and recover from errors, to change and configure itself, and to optimize itself, in the presence of problems and changes, for maximum performance with minimum outside intervention. From the RAS perspective, the longstanding RAS strategy for the IBM S/390® and now the zSeries has provided an excellent foundation for self-management. This paper describes the z900 RAS enhancements and how they strengthen the RAS strategy building blocks and provide a basis for autonomic computing.