Liquid cooling, once common only in high-performance computing (HPC), is being adopted by a growing number of data centers. The driver is the increasing number of high-density racks deployed to support artificial intelligence and other latency-sensitive, processing-intensive business applications. These racks regularly exceed 30 kW, which is past the threshold at which air cooling operates efficiently.
Engineers responsible for data center thermal management who are considering these high-density racks must confront the limits of air as a heat transfer medium, or risk reduced efficiency and even server failure.
Liquid cooling is the next step in effective, efficient cooling. Its various configurations offer improved performance and reliability for high-density racks, often at a lower total cost than air cooling systems. Those introducing liquid cooling, especially into an air-cooled or hybrid facility, must navigate several design challenges to make a successful transition.
Material compatibility and plumbing design
Concerns about hardware integrity are a common point of resistance to liquid cooling adoption. While the introduction of liquid into the data center must be planned carefully, best practices have emerged to protect sensitive IT equipment and minimize the risk of leaks. Take particular care with potential weak spots such as fittings and, where possible, opt for quick-disconnect fittings (which enable serviceability) and shutoff valves (which allow fittings to be disconnected and leaks to be addressed). Draw on the significant experience that professionals working on material compatibility and fitting design in HPC applications have gained over the past ten years. Open Compute Project’s white paper is a great source of information on this topic.
Leak detection system
A leak detection system, tailored to the organization’s risk tolerance and familiarity with liquid cooling, is another essential component of any liquid cooling design. The most common approach is direct detection, in which strategically located sensors, or a sensing cable run along the distribution system, trigger alarms when fluid is detected. A less common approach is indirect detection, in which small changes in pressure and flow across the fluid distribution system are treated as indicators of a potential leak.
Whether direct or indirect, a successful leak detection system is one that minimizes false alarms without missing real leaks that require intervention. Intervention can be performed manually or through automated control systems that trigger responses like shutting off liquid flow or powering down servers in racks near the leak.
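To illustrate the trade-off between false alarms and missed leaks, the sketch below shows one simple way an indirect detector could be structured. It flags a potential leak only when loop pressure stays below a baseline by more than a tolerance for several consecutive samples. The function name, the thresholds and the sample readings are all hypothetical and chosen only for illustration; a real system would be tuned to the specific loop and would likely combine pressure with flow data.

```python
from typing import Iterable, Optional


def first_leak_alarm(
    pressures_kpa: Iterable[float],
    baseline_kpa: float,
    tolerance_kpa: float = 10.0,    # assumed allowable drop below baseline
    consecutive_required: int = 3,  # assumed number of samples before alarming
) -> Optional[int]:
    """Return the index of the sample that triggers an alarm, or None."""
    run = 0  # length of the current streak of out-of-tolerance samples
    for i, pressure in enumerate(pressures_kpa):
        if baseline_kpa - pressure > tolerance_kpa:
            run += 1
            if run >= consecutive_required:
                return i
        else:
            run = 0  # a normal reading resets the streak
    return None


# Hypothetical readings: one transient dip, then a sustained pressure drop.
readings = [200.0, 199.0, 185.0, 200.0, 188.0, 187.0, 186.0, 185.0]
alarm_index = first_leak_alarm(readings, baseline_kpa=200.0)
```

Requiring more consecutive samples reduces false alarms from transients but delays the response to a real leak, so this setting should reflect the organization’s risk tolerance.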
Cooling distribution units
Establishing a secondary cooling loop in the facility allows precise control of the liquid being distributed to the rack. The key component is the cooling distribution unit (CDU).
CDUs deliver fluid to liquid cooling systems and remove heat from the fluid being used. Using a CDU separates the liquid cooling system from the facility’s water system, which provides more precise control of fluid volumes and pressure. This separation limits the potential impact of any leaks that may occur. By maintaining the supply temperature above the data center dew point, a CDU can also prevent condensation that could otherwise trigger false alarms in leak detection systems.
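As a rough illustration of the dew point check described above, the sketch below estimates dew point from room air temperature and relative humidity using the Magnus approximation, then compares it with a proposed supply temperature. The function names, the 2 °C safety margin and the example room conditions are assumptions made for this example, not values from any CDU specification.

```python
import math


def dew_point_c(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Approximate dew point in deg C using the Magnus formula."""
    b, c = 17.62, 243.12  # a commonly used set of Magnus coefficients
    gamma = math.log(relative_humidity_pct / 100.0) + (b * air_temp_c) / (c + air_temp_c)
    return (c * gamma) / (b - gamma)


def supply_is_condensation_safe(
    supply_temp_c: float,
    air_temp_c: float,
    relative_humidity_pct: float,
    margin_c: float = 2.0,  # assumed safety margin above dew point
) -> bool:
    """True if the supply temperature stays above dew point plus the margin."""
    return supply_temp_c >= dew_point_c(air_temp_c, relative_humidity_pct) + margin_c


# Hypothetical room conditions: 24 deg C air at 50% relative humidity.
room_dew_point = dew_point_c(24.0, 50.0)  # roughly 13 deg C
warm_supply_ok = supply_is_condensation_safe(18.0, 24.0, 50.0)
cold_supply_ok = supply_is_condensation_safe(13.0, 24.0, 50.0)
```

In this example the 18 °C supply passes the check while the 13 °C supply does not, because it sits too close to the estimated dew point.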
A CDU with a liquid-to-air heat exchanger can simplify deployment in a smaller system, as long as the air-cooling system can handle the heat rejected from the CDU. Most CDUs, however, use a liquid-to-liquid heat exchanger to capture the heat returned from the racks and reject it through the chilled water system. CDUs are typically designed to sit close to the racks they support, though they can also be positioned on the perimeter of the data center.
When designing a liquid cooling system, it’s critical to select the appropriate fluid as early in the design process as possible. The type of liquid the system will use depends on many factors, including the system technology. The three most common technologies in use today are rear-door heat exchangers, direct-to-chip cold plates and immersion cooling. Rear-door systems generally use a water/glycol mixture or refrigerant, while direct-to-chip cooling systems can use water, refrigerant, or dielectric fluids. Immersion systems must use dielectric fluids. Dielectric fluids reduce the risk of equipment damage from a fluid leak; however, they are expensive and may present environmental, health and safety concerns.
When selecting the fluid for the system, consider each candidate’s cost, chemical composition and heat removal capacity, as well as its maintenance requirements. The Vertiv white paper, Understanding Data Center Liquid Cooling Options and Infrastructure Requirements, gives details on liquid cooling technologies.
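One way to compare heat removal capacity between candidate fluids is to estimate the flow rate each would need for a given heat load, using the standard relation Q = m × cp × ΔT (heat load equals mass flow times specific heat times temperature rise). The sketch below is a minimal example of that arithmetic. The function name and the 10 K temperature rise are assumptions; the water properties are approximate room-temperature values, and properties for glycol mixtures or dielectric fluids should come from the supplier’s data sheet.

```python
def required_flow_lpm(
    heat_load_kw: float,
    delta_t_k: float,
    density_kg_m3: float,
    specific_heat_j_kg_k: float,
) -> float:
    """Volumetric flow (litres per minute) needed to carry a heat load."""
    # Q = m_dot * cp * dT, so m_dot = Q / (cp * dT)
    mass_flow_kg_s = heat_load_kw * 1000.0 / (specific_heat_j_kg_k * delta_t_k)
    volume_flow_m3_s = mass_flow_kg_s / density_kg_m3
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min


# Approximate properties of water near room temperature.
WATER_DENSITY_KG_M3 = 997.0
WATER_SPECIFIC_HEAT_J_KG_K = 4180.0

# Example: a 30 kW rack with an assumed 10 K rise between supply and return.
water_flow_lpm = required_flow_lpm(
    heat_load_kw=30.0,
    delta_t_k=10.0,
    density_kg_m3=WATER_DENSITY_KG_M3,
    specific_heat_j_kg_k=WATER_SPECIFIC_HEAT_J_KG_K,
)
```

Under these assumptions the water loop needs roughly 43 L/min. A fluid with a lower specific heat needs proportionally more flow for the same load and temperature rise, which affects pump and pipe sizing.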
Balancing air and liquid cooling capacity
Engineers are often tasked with adding liquid cooling to an air-cooled data center, creating a hybrid system. A key consideration is how much of the total heat load each system will handle and what new demands may be introduced when liquid is added.
Only immersion cooling systems can operate without support from air cooling. Direct-to-chip cold plates are typically installed only on the main heat-generating components within the rack (CPUs, GPUs and memory) and remove between 70% and 80% of the heat the rack generates. For a 30 kW rack, that leaves 6 kW to 9 kW of the load to be managed by the air-cooling system. Rear-door heat exchangers remove 100% of the heat load from the rack, but because these systems expel cooled air into the data center through the rear of the rack, the air-cooling system must handle the full heat load of the rack when the rear door is open for maintenance.
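The direct-to-chip arithmetic above can be reproduced with a few lines of code, which is handy when sizing the air-cooling share for racks of different densities. This is a minimal sketch; the function name is hypothetical, and the capture fractions are simply the 70% and 80% figures quoted above.

```python
def residual_air_load_kw(rack_load_kw: float, liquid_capture_fraction: float) -> float:
    """Heat (kW) left for the air-cooling system after liquid cooling."""
    if not 0.0 <= liquid_capture_fraction <= 1.0:
        raise ValueError("liquid_capture_fraction must be between 0 and 1")
    return rack_load_kw * (1.0 - liquid_capture_fraction)


# A 30 kW rack with cold plates capturing 80% and 70% of the heat.
best_case_kw = residual_air_load_kw(30.0, 0.80)   # about 6 kW left for air
worst_case_kw = residual_air_load_kw(30.0, 0.70)  # about 9 kW left for air

# A rear-door unit opened for maintenance is modelled here as capturing nothing.
door_open_kw = residual_air_load_kw(30.0, 0.0)    # the full 30 kW falls to air
```

Sizing the air-cooling system against the worst case, including the door-open condition for rear-door units, avoids a shortfall during maintenance.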
Especially in a hybrid system, careful planning is essential when designing and installing plumbing. In raised-floor data centers, liquid cooling pipes must not obstruct airflow. In slab data centers, piping should run above the aisles, supported by ceiling structures, with drip pans under fittings to contain possible leaks. Whether plumbing runs underfloor or overhead, an efficient installation minimizes downtime for existing data center operations. A phased approach may be the best way to ensure the data center is never fully down, either by prioritizing liquid cooling installation in areas that will expand in the future or by focusing on a single section of racks at a time.
The best plan is planning ahead
More and more data centers are supporting high-density racks, often turning to liquid cooling to regulate server temperatures. Liquid cooling presents new challenges in data center design, but its higher heat-removal capacity makes it a desirable option. Because experience integrating liquid cooling systems in air-cooled facilities is still somewhat limited, the best approach is to work with experienced liquid cooling vendors for the system itself and the supporting infrastructure. Vertiv harnesses years of experience with air and liquid cooling technology to offer the proven solutions and expertise required to successfully implement liquid cooling.