Ali Heydari scite author profile

Shahi

Radmard

et al. 2022

Increasing demands for cloud-based computing and storage, Internet-of-Things, and machine learning-based applications have necessitated the use of more efficient cooling technologies. Direct-to-chip liquid cooling using cold plates has proven to be one of the most efficient methods to dissipate the high heat fluxes of modern high-power CPUs and GPUs. While the published literature documents the thermal aspects of direct liquid cooling well, a detailed account of its transient hydraulic behavior is still missing. In this experiment, a total of four 52U racks with four high-power TTV-servers (Thermal Test Vehicles) in each rack were designed and deployed. Each server consists of eight GPU TTVs and six NV switch heaters. Each of the two racks has a different vendor rack manifold and cooling loop modules (CLM). A 450 kW coolant distribution unit (CDU) is used to supply 25% propylene glycol coolant to these racks. Each rack has its own rack-level flow control valve to maintain the same flow rate. The present study provides an in-depth analysis of hydraulic transients when rack-level flow control valves are used with and without flow control. The CDU is operated under different control modes, such as constant flow rate, constant differential pressure, and constant pump speed. Furthermore, the hydraulic transient is examined when the cooling loop modules are decommissioned from the rack one by one, and the effect of this step-by-step decommissioning on the CDU operation and the other racks is assessed. A pressure drop-based control strategy is developed to maintain the same flow rate in the remaining servers in the rack when some cooling loop modules are decommissioned.
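An idealized sketch may help show why holding a differential pressure can keep per-server flow steady as cooling loop modules are removed. The Python below is not the study's control strategy. It assumes identical parallel modules that each follow a quadratic law, ΔP = k·q², and it ignores manifold and valve losses. The coefficient `K_CLM`, the setpoints, and the function names are hypothetical.

```python
import math

K_CLM = 0.5  # hypothetical loss coefficient per module, kPa/(L/min)^2


def flow_at_constant_dp(dp_kpa: float, k: float = K_CLM) -> float:
    """Per-module flow (L/min) when the rack differential pressure is held fixed."""
    return math.sqrt(dp_kpa / k)


def flow_at_constant_total(total_lpm: float, n_active: int) -> float:
    """Per-module flow (L/min) when the total rack flow is held fixed."""
    return total_lpm / n_active


if __name__ == "__main__":
    dp_setpoint = 50.0  # kPa, hypothetical setpoint
    total_flow = 40.0   # L/min, hypothetical; 4 modules at 10 L/min each
    for n in (4, 3, 2, 1):  # decommission modules one by one
        q_dp = flow_at_constant_dp(dp_setpoint)
        q_total = flow_at_constant_total(total_flow, n)
        print(f"{n} active: constant-dP {q_dp:.1f} L/min, "
              f"constant-flow {q_total:.1f} L/min per module")
```

Under these assumptions, the per-module flow at a fixed differential pressure does not depend on how many modules remain, whereas a fixed total flow pushes more coolant through each remaining module.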

Refrigerant to Air Cooling for High Heat Density Two-Phase Cooled Data Centers

Manaserh

Mehrabi

et al. 2023

Determination of the Thermal Performance Limits for Single Phase Liquid Cooling Using an Improved Effectiveness-NTU Cold Plate Model

Ortega

Caceres

Uras

et al. 2022

Cold plates are at the heart of pumped liquid cooling systems. In this paper, we report on combined experimental, analytical, and computational efforts to characterize and model the thermal performance of advanced cold plates in order to establish their performance limits. A novel effectiveness-NTU formulation is introduced that models the fin array as a secondary “pseudo-fluid” so that accurate, well-known crossflow effectiveness formulations can be used to model the cold plates. Experimental measurements and conjugate CFD simulations were performed on cold plates with fin and channel features of order 100 µm, with water-propylene glycol (PG) mixtures as coolants. We show that for a fixed fin geometry, the best thermal performance, regardless of the pressure drop, is achieved when the flow rate is high enough to approach the low-NTU convective limit, which occurs as NTU approaches zero. For the model cold plate evaluated in this study, the lowest thermal resistance achieved, at a flow rate of 4 LPM, was 0.01 °C/W, and the convective limit was 0.005 °C/W. However, for a fixed pressure drop, the optimal cold plate should be designed to meet its thermal design power (TDP) at the highest possible effectiveness, in which case the lower limit of thermal resistance is the advective limit, reached for NTU > 7. For the tested cold plate, the advective limit for the thermal resistance is 0.003 °C/W, but this limit can only be achieved if it is practically feasible to increase the surface area and heat transfer coefficient to maximize NTU for a targeted TDP.
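The two limits named in the abstract can be illustrated with the textbook effectiveness-NTU relation for an isothermal surface, ε = 1 − exp(−NTU), which gives R = 1/(ε·ṁ·cp). This is a simplified stand-in, not the paper's pseudo-fluid crossflow formulation. The conductance `HA`, the capacity rates, and the function name are hypothetical, and h·A is held fixed as the flow changes purely for simplicity.

```python
import math


def cold_plate_resistance(ha_w_per_k: float, c_w_per_k: float) -> float:
    """Thermal resistance (K/W) of an isothermal-surface exchanger.

    ha_w_per_k: effective conductance h*A, W/K (assumed to include fin efficiency)
    c_w_per_k:  coolant capacity rate m_dot*cp, W/K
    """
    ntu = ha_w_per_k / c_w_per_k
    effectiveness = -math.expm1(-ntu)  # 1 - exp(-NTU), accurate for small NTU
    return 1.0 / (effectiveness * c_w_per_k)


if __name__ == "__main__":
    HA = 200.0  # W/K, hypothetical conductance
    for c in (20.0, 200.0, 2000.0, 2.0e5):  # hypothetical capacity rates, W/K
        r = cold_plate_resistance(HA, c)
        print(f"C = {c:9.1f} W/K  NTU = {HA / c:8.4f}  R = {r:.5f} K/W")
```

In this simplified model, R tends to 1/(h·A) as NTU approaches zero (the convective limit) and to 1/(ṁ·cp) as NTU grows large (the advective limit). At NTU = 7 the effectiveness already exceeds 0.999.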

Liquid to Air Cooling for High Heat Density Liquid Cooled Data Centers

Radmard

Eslami

et al. 2022

Growing demand for dense, high-performing IT compute capacity to support deep learning and artificial intelligence workloads requires data centers to adopt more robust thermal management strategies. Today, data centers across the world are turning to liquid-based cooling solutions to keep up with the increased cooling demand of high-power racks approaching 100 kW of heat dissipation. Direct-to-chip cold plate liquid cooling is one of the mainstream approaches, allowing targeted cooling of high-power processors. This study provides the framework for a hybrid in-row cooler (IRC) with a liquid-to-air (L2A) heat exchanger (HX) system that delivers chilled coolant to liquid-cooling cold plates mounted on the high heat dissipation electronics. This approach is useful for high heat density cooling of racks where no primary facility coolant is available at the data center. The present study investigates the thermo-hydraulic performance of a distinct L2A IRC system that supplies cold secondary coolant (PG 25%) to the cooling loops of liquid-cooled servers in racks within an existing air-cooled data center. Thermal test vehicles (TTVs) are built to replicate actual high heat density servers. The selection of components at each level, from the cold plate to the data center, is described based on cooling performance and relevance. Three different cooling loop/rack designs are characterized experimentally, and detailed analytical and numerical flow network modeling (FNM) simulations are developed to analyze the heat exchanger performance. FNM and CFD models of a data center are run in both steady-state and transient forms to study the performance of the L2A IRC in a data center.

A Control Strategy for Minimizing Temperature Fluctuations in High Power Liquid to Liquid CDUs Operated at Very Low Heat Loads

Shahi

Radmard

et al. 2022

The rising demand for high-performance central and graphical processing units has resulted in the need for more efficient thermal management techniques such as direct-to-chip liquid cooling. Direct liquid cooling using cold plates has been one of the most efficient and most investigated cooling technologies since the 1980s. Major data and cloud providers are actively deploying liquid-cooled data center infrastructure due to rising computational demands. The liquid-to-liquid heat exchangers used in liquid-cooled data centers are also referred to as coolant distribution units (CDUs). Data center operators typically select a CDU based on the heat load of the data center and the head available from that CDU. In this study, three 52U racks with six high-power TTV-based servers (Thermal Test Vehicles) in each rack were designed and deployed. Each server consists of eight GPU TTVs and six NV switch heaters. A 450 kW liquid-cooled CDU is used, with 25% propylene glycol as the coolant. Typical CDUs are designed to operate at 20 to 30% of the rated heat load to achieve a stable secondary coolant supply temperature. The present study investigates the operation of the CDU at very low heat loads, from 1% to 10% of the CDU’s rated capacity. At these low loads, large fluctuations in the secondary-side supply temperature were observed. Such large fluctuations can lead to failure of the 3-way valve used on the primary side of the CDU. In this paper, a control strategy is developed to stabilize the secondary supply temperature within ± 0.5 °C at very low loads using a combination of a flow control valve on the primary side and PID control settings within the CDU.
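To put the load percentages in absolute terms, the short sketch below converts them to kilowatts for the 450 kW CDU and checks a supply temperature against the ± 0.5 °C band. The function names, the 32 °C setpoint, and the sample readings are hypothetical. Only the rated capacity, the percentages, and the band come from the abstract.

```python
RATED_KW = 450.0  # CDU rated capacity stated in the abstract


def load_kw(fraction: float, rated_kw: float = RATED_KW) -> float:
    """Heat load in kW for a given fraction of the rated capacity."""
    return fraction * rated_kw


def within_band(supply_c: float, setpoint_c: float, band_c: float = 0.5) -> bool:
    """True if the supply temperature lies within +/- band_c of the setpoint."""
    return abs(supply_c - setpoint_c) <= band_c


if __name__ == "__main__":
    for label, lo, hi in (("very low load", 0.01, 0.10),
                          ("typical design range", 0.20, 0.30)):
        print(f"{label}: {load_kw(lo):.1f} to {load_kw(hi):.1f} kW")

    setpoint = 32.0  # deg C, hypothetical secondary supply setpoint
    for reading in (31.8, 32.4, 33.1):  # hypothetical readings, deg C
        print(f"{reading:.1f} C within band: {within_band(reading, setpoint)}")
```

The very low load range therefore corresponds to 4.5 to 45 kW, compared with 90 to 135 kW for the typical design range.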