How AI optimisation affects data centre design
by Colin Ryan
Tate Cantrell. Image: Pawel Swider
Verne Global’s Tate Cantrell discusses the implications of AI for data centre design, from complex cooling requirements to NATO-level security protocols.
Recently, Meta shared details of its plans for AI advancements, which included an AI-optimised data centre design, stating that the new design will support “liquid-cooled AI hardware and a high-performance AI network connecting thousands of AI chips for data centre-scale AI training clusters”.
The company also stated that the new design will be faster and more cost-effective to build. But how does AI optimisation actually affect the intricacies of data centre design?
To find out, we spoke to Tate Cantrell, chief technology officer at Verne Global, who gave us an insight into the many ways data centre design must change to accommodate AI workloads.
“Data centre design is a complex task of balancing power demand, cooling requirements, high security, extreme levels of reliability and high-speed access to networks,” said Cantrell.
“AI models require a much higher intensity and density of compute, adding a new dimension of complexity to the challenges of traditional data centre design.”
According to Cantrell, some of the core parameters affected by hosting AI models are power source and reliability; the need for higher-density server racks; and scalability.
“A traditional data centre that is not designed for these extreme conditions will be unable to provide reliable performance. Ultimately, the data centre industry needs modified data centre design if it is to keep pace with the demands of AI technology.”
But why do AI models need high-density compute? Cantrell said it’s because AI computing requires “extremely low-latency network connections between servers within the data centre”.
“The average rack density a few years ago was 5kW per rack. But the latest generation of AI supercomputers require much more from data centre infrastructure.
“Just four of these systems in one rack could consume more than 40kW while only occupying 60pc of the space of a typical computing rack. So, if data centres are to effectively handle AI hardware, they will need to be capable of this kind of high-density compute.”
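To put those figures in context, here is a rough back-of-the-envelope comparison using the numbers Cantrell quotes. The framing (power per unit of rack floor space) is an illustration, not a Verne Global or vendor specification.

```python
# Back-of-the-envelope comparison using the figures quoted above.
# The framing (power per unit of rack floor space) is illustrative only,
# not a vendor or Verne Global specification.

TRADITIONAL_RACK_KW = 5.0      # "average rack density a few years ago"
AI_RACK_KW = 40.0              # "more than 40kW" for four AI systems
AI_RACK_SPACE_FRACTION = 0.6   # "60pc of the space of a typical computing rack"

traditional_density = TRADITIONAL_RACK_KW / 1.0
ai_density = AI_RACK_KW / AI_RACK_SPACE_FRACTION

print(f"Traditional: {traditional_density:.0f} kW per rack's worth of space")
print(f"AI cluster:  {ai_density:.0f} kW per rack's worth of space")
print(f"Roughly a {ai_density / traditional_density:.0f}x jump in power -- and heat -- per unit of floor space")
```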
According to Cantrell, most conventional data centres are not equipped to deal with the “enormous” compute required to train AI neural networks, particularly in terms of cooling.
While traditional data centres rely on widely spaced server racks to help with cooling, machine learning applications require racks that are placed close together, because this proximity improves latency and bandwidth between servers while minimising the overall cost of deployment.
“To add to the complexity, air-cooled systems that are positioned too closely together can result in cooling deficiencies as the extreme airflow requirements of high-capacity servers can blow against each other and create backpressure on the cooling fans within the equipment,” Cantrell added.
“Data centres must therefore balance the financial pressures of reducing the footprint of the data hall with the need to provide sufficient space for proper cooling.
“This will be one of the reasons we see the increasing adoption of liquid cooling accelerate.”
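For a sense of why air struggles at these densities, the airflow needed to carry heat away scales linearly with power. The sketch below is a generic heat-transfer estimate assuming standard air properties and a 10°C inlet-to-exhaust temperature rise; none of the figures come from Verne Global.

```python
# Rough airflow estimate for removing rack heat with air alone.
# Assumes standard air properties and a 10 C rise from rack inlet to exhaust;
# these are illustrative assumptions, not Verne Global design figures.

AIR_DENSITY = 1.2   # kg/m^3 at roughly 20 C
AIR_CP = 1005.0     # J/(kg*K), specific heat capacity of air
DELTA_T = 10.0      # K temperature rise across the rack

def airflow_m3_per_s(power_kw: float) -> float:
    """Volumetric airflow needed to carry away power_kw of heat."""
    return (power_kw * 1000.0) / (AIR_DENSITY * AIR_CP * DELTA_T)

for kw in (5, 40):
    flow = airflow_m3_per_s(kw)
    print(f"{kw:>2} kW rack: ~{flow:.2f} m^3/s of air (~{flow * 2118.88:,.0f} CFM)")

# Eight times the power means eight times the airflow through the same
# footprint, which is why tightly packed air-cooled racks can create the
# fan backpressure Cantrell describes -- and why liquid cooling appeals.
```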
As if complex cooling requirements weren’t enough, Cantrell says that data centres will also need to be “structurally capable of handling heavy equipment” to allow the movement of heavy AI computing cabinets, which could weigh more than 1.5 tonnes when fully configured.
‘AI itself also has a role to play when it comes to data centre security’
Cantrell says that data centre infrastructure must live up to demands for “connectivity, agility and scalability” in order to house and analyse the increasingly large datasets that AI models are trained on.
“These demands will vary as new applications are developed, old applications are retired, and workloads are adapted to meet the current needs of business.
“What remains the same is that everything must be designed to provide the optimum performance for any application.”
Cantrell also believes that businesses will begin to “create their own parallel proprietary engines in order to keep their competitive advantage in the marketplace”, adding that companies like Google and Microsoft will compete to operate and train these independent systems while “offering an ecosystem of data sources to pair with a company’s confidential training information”.
“This competition to develop parallel computing resources will drive the industry towards innovations such as knowledge distillation at the algorithm level and innovations such as liquid cooling at the data centre layer.”
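Knowledge distillation, which Cantrell mentions, trains a smaller ‘student’ model to reproduce the softened output distribution of a larger ‘teacher’. The sketch below is a generic illustration of that soft-label objective; the logits and temperature are made-up values, not anything drawn from the interview.

```python
# Minimal sketch of knowledge distillation's soft-label objective, as a
# generic illustration of the technique named above -- figures are made up.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)   # teacher's soft labels
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A smaller "student" model is trained to match the larger teacher's soft outputs
print(distillation_loss(teacher_logits=[4.0, 1.0, 0.2],
                        student_logits=[3.5, 1.2, 0.3]))
```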
In terms of connectivity, Cantrell says that data centres will need “low-latency, east-west spine and leaf networks that can support both production traffic and the machine learning part of AI”.
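In a leaf-spine (east-west) fabric, every leaf switch connects to every spine, so traffic between any two servers crosses at most two switch hops, which keeps latency low and predictable. The sketch below is a generic illustration of that topology rather than a description of any particular facility; the switch counts are arbitrary.

```python
# Generic illustration of a two-tier leaf-spine (east-west) fabric of the
# kind described above -- not a description of any specific facility.
# Switch counts are arbitrary, illustrative values.

from itertools import product

def leaf_spine_links(num_leaves: int, num_spines: int) -> list[tuple[str, str]]:
    """Every leaf switch uplinks to every spine switch, so any two servers on
    different leaves are exactly two switch hops apart (leaf -> spine -> leaf)."""
    return [(f"leaf{leaf}", f"spine{spine}")
            for leaf, spine in product(range(num_leaves), range(num_spines))]

links = leaf_spine_links(num_leaves=8, num_spines=4)
print(f"Uplinks in the fabric: {len(links)}")   # 8 leaves x 4 spines = 32
print("Worst-case east-west path: leaf -> spine -> leaf (two hops)")
```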
As well as demanding adapted network infrastructure and internal restructuring, hosting AI models in data centres introduces unique security considerations.
Cantrell said that it’s vital that data centres “prioritise the physical security of business-critical AI assets”.
“If someone is able to gain access to the model, and more importantly, the weights associated with the model, one could easily gain access to the composition of all of the data that has gone into the training of the AI model.”
Cantrell cited this as one reason why Verne Global’s Icelandic data centre campus was built on a highly secure former NATO base.
He added that protecting these AI assets requires multiple layers of security, including “strict identification protocols, multiple challenge points, CCTV, physical patrols and staff on site 24/7, as well as server cages, security racks, and biometric detection”.
“AI itself also has a role to play when it comes to data centre security. The technology can examine, profile and recognise cyberattacks at a speed and scale that is simply not possible for humans.”
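As a generic illustration of the kind of machine-scale monitoring Cantrell describes, the sketch below flags traffic spikes against a statistical baseline. The log values, feature (requests per minute) and threshold are invented for illustration and do not describe any specific security product.

```python
# Generic illustration of machine-scale anomaly detection on access logs.
# The traffic figures, feature (requests per minute) and threshold are
# invented for illustration, not any specific security system's behaviour.

from statistics import mean, stdev

baseline = [52, 48, 55, 50, 49, 51, 47, 53, 50, 54]   # normal minutes
mu, sigma = mean(baseline), stdev(baseline)

def z_score(requests_per_minute: int) -> float:
    """How many standard deviations a minute's traffic sits above baseline."""
    return (requests_per_minute - mu) / sigma

for minute, count in enumerate([51, 49, 310, 52]):
    if z_score(count) > 3.0:
        print(f"minute {minute}: {count} req/min (z={z_score(count):.1f}) -- flag for review")
```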
Colin Ryan is a copywriter/copyeditor at Silicon Republic