Computacenter plc


AI in Data Centers: Designing Infrastructure for Innovation


In today's digital age, artificial intelligence (AI) has evolved from a buzzword to a transformative force, reshaping industries. Building customised AI solutions demands specialised infrastructures tailored to their unique needs, particularly in compute, storage, and network areas, as well as the data centres housing these systems.

AI applications require robust compute resources with powerful processors and GPUs to handle extensive volumes of data and complex algorithms. Efficient storage solutions are crucial for quick access to large datasets, while high-bandwidth, low-latency network infrastructures ensure seamless communication between components.

Optimised data centres are essential for providing physical security and an ideal operating environment for AI hardware. Together, these elements enable companies to fully harness AI technologies in their data centres or edge locations, fostering innovation. While this article focuses on AI infrastructures in data centres, public cloud offerings are also viable. The choice hinges more on the operating model for resource provision and where the data resides than on technological factors.


Data Centres: The Physical Foundation

Given the high energy consumption of AI systems, designing energy-efficient data centres is crucial. The physical infrastructure of a data centre must be robust enough to support the unique needs of AI systems, including high-performance cooling systems for heat dissipation and security measures to protect sensitive data. The high power requirements and rack densities mean that some existing data centre facilities simply cannot handle the required AI workloads, so modular or containerised facilities may need to be considered.

Innovative cooling solutions, such as liquid cooling, help manage high temperatures. Although not new, liquid cooling is not widely used in most data centres today and often requires retrofitting. (Read more about Direct Liquid Cooling in our recent blog article.)

Besides cooling, other facilities like emergency power supplies, power distribution, and physical security are vital. Data centres must be safeguarded against physical threats such as break-ins, fires, and natural disasters, necessitating surveillance systems, access controls, and fire protection measures.

High availability and reliability are also essential, requiring redundant power supplies and emergency generators to prevent outages.

Location is also a key consideration. Factors like climate conditions, energy supply, and proximity to key data sources can impact the efficiency and security of the infrastructure.

Overall, the physical infrastructure of a data centre demands careful planning and continuous maintenance to meet the high demands of AI systems. Below are the key considerations when determining whether a data centre is appropriate for AI; a rough power and cooling sizing sketch follows the list:

  • Power: Is sufficient, stable power capacity available? Are renewables mandated?
  • Cooling: Can the climate and infrastructure meet the requirements of liquid cooling?
  • Latency: Do you have proximity to the required data, cloud services or edge platforms?
  • Resilience: Have you assessed vulnerability to system outages, environmental disruptions, or geopolitical risks?
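
For a first sanity check against the Power and Cooling questions above, here is a minimal Python sketch that estimates the electrical and facility load of a single AI rack. The server count, GPU TDP, host overhead, and PUE are purely illustrative assumptions, not vendor figures:

    # Rough per-rack power and cooling estimate for an AI rack.
    # All figures are illustrative assumptions, not vendor data.

    def rack_power_kw(servers_per_rack: int,
                      gpus_per_server: int,
                      gpu_tdp_w: float,
                      host_overhead_w: float = 1500.0) -> float:
        """Estimate the electrical (IT) load of one rack in kilowatts."""
        per_server_w = gpus_per_server * gpu_tdp_w + host_overhead_w
        return servers_per_rack * per_server_w / 1000.0

    def facility_load_kw(it_load_kw: float, pue: float = 1.3) -> float:
        """Total facility load (IT plus cooling and distribution) implied by a PUE."""
        return it_load_kw * pue

    it_kw = rack_power_kw(servers_per_rack=4, gpus_per_server=8, gpu_tdp_w=700)
    print(f"IT load per rack: {it_kw:.1f} kW")                           # ~28.4 kW
    print(f"Facility load (PUE 1.3): {facility_load_kw(it_kw):.1f} kW")  # ~36.9 kW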

Compute: The Power Behind AI

Implementing AI infrastructures can be challenging, especially when it comes to server hardware. AI models demand immense computing power, and new processor generations arrive rapidly, each offering significantly more performance than the last; as a result, the infrastructure has to be refreshed in shorter cycles than classic IT. Powerful processors are essential for efficiently processing large datasets, but not all AI requires the power of GPUs, and traditional high-end CPUs are still worth considering:

  • CPU (Central Processing Unit) - a general-purpose processor for a wide range of tasks
  • GPU (Graphics Processing Unit) - crucial for parallel calculations, outperforming conventional CPUs on highly parallel workloads
  • TPU (Tensor Processing Unit) - highly efficient for training and inference in deep learning models

As a general guide to processor selection, consider the illustrative sketch below:
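
The workload categories and mapping rules in this sketch are simplifying assumptions for illustration, not a formal selection matrix:

    # Illustrative helper mapping workload characteristics to a processor type.
    # The categories and rules are simplifying assumptions for demonstration only.

    def suggest_processor(workload: str, parallelism: str = "low") -> str:
        """Return a rough processor suggestion for an AI workload.

        workload:    "deep-learning-training", "inference", "classic-ml" or "general"
        parallelism: "low" or "high" (how well the task parallelises)
        """
        if workload == "deep-learning-training":
            return "GPU or TPU - massively parallel tensor operations"
        if workload == "inference":
            return "GPU, TPU or modern CPU - depends on latency and batch size"
        if parallelism == "high":
            return "GPU - parallel calculations outperform conventional CPUs"
        return "CPU - general-purpose tasks and classic machine learning"

    print(suggest_processor("classic-ml"))                      # CPU
    print(suggest_processor("deep-learning-training", "high"))  # GPU or TPU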


Storage: Meeting the Demands of Intelligent Systems

AI applications generate and utilise extensive amounts of data, necessitating both high capacity and speed for storage. NVMe (Non-Volatile Memory Express) storage offers rapid data access, while tape drives are valuable for archiving large datasets. Efficient data management solutions ensure relevant data is readily available, with data lakes and networks enabling large-scale storage of structured and unstructured data.

Balancing cost and performance is a major challenge. High-performance solutions like NVMe are costly, whereas more affordable options like traditional hard drives and tapes may lack the required speed. Companies must carefully evaluate which storage solutions best meet their specific needs.

Scalability is another critical factor. As data volumes grow exponentially, storage solutions must expand flexibly without compromising performance. This requires well-planned, scalable storage architectures. Reliability is also crucial, as data loss or corruption can have severe consequences, especially in critical applications. Robust backup and recovery solutions are essential to maintain data integrity.

Energy efficiency is an additional challenge, as high-performance storage systems consume significant energy. Implementing efficient hardware solutions and sustainable energy sources is vital for IT departments.

So, when evaluating storage systems for AI, remember the points below; a rough throughput sizing sketch follows the list:

  • Performance - don't create a bottleneck!
  • Scale - can it grow to accommodate the data sets?
  • Cost - ensure the storage is efficient to balance performance and scale
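
To make the performance point concrete, here is a minimal Python sketch that estimates the sustained read bandwidth the storage layer must deliver to keep a training cluster fed. The dataset size and epoch time are purely illustrative assumptions:

    # Estimate the sustained read throughput storage must deliver so the
    # compute layer is not starved. All input values are illustrative.

    def required_read_gb_per_s(dataset_tb: float,
                               epoch_minutes: float,
                               reread_factor: float = 1.0) -> float:
        """Sustained read bandwidth (GB/s) to stream the dataset once per epoch."""
        dataset_gb = dataset_tb * 1000.0
        seconds = epoch_minutes * 60.0
        return dataset_gb * reread_factor / seconds

    # Example: a 50 TB training set, one full pass every 30 minutes.
    bw = required_read_gb_per_s(dataset_tb=50, epoch_minutes=30)
    print(f"Required sustained read bandwidth: {bw:.1f} GB/s")  # ~27.8 GB/s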

Network: Connecting the Components

A high-performance network is the backbone of any AI infrastructure, enabling quick and reliable data transfer between systems. High-bandwidth networks are essential for optimising data transfer between storage systems and computing units. AI systems often involve numerous processing units, especially GPUs, placing significant demands on network infrastructure for 'east-west traffic,' which may require hundreds or thousands of high-speed, low-latency ports.

  • AI networks require high speed (100 Gbit/s, 400 Gbit/s, and 800 Gbit/s) - see the sizing sketch at the end of this section.

Network reliability is critical. Failures or interruptions can cause significant delays and data loss, which is unacceptable in many applications.

  • Robust network architectures and redundancy mechanisms are essential for continuous availability.

Security is another key aspect, as AI applications often handle sensitive data. Comprehensive security measures, including data encryption during transmission, firewalls, and intrusion detection systems, are necessary to protect against unauthorised access and cyberattacks.

  • AI data is often sensitive - ensure robust protection is in place.

Finally, network scalability is vital. As data volumes grow and AI models become more complex, the network must adapt flexibly to increasing demands. This requires careful planning and continuous monitoring of network performance.

  • Software-defined networking (SDN) allows flexible adaptation of network capacities to changing needs.
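
To put rough numbers on the east-west fabric, here is a minimal Python sketch that estimates how many high-speed ports and how much aggregate bandwidth a GPU cluster needs. The GPU count, NICs per GPU, and link speed are illustrative assumptions, and the calculation ignores spine uplinks and oversubscription:

    # Rough estimate of GPU-facing ports and aggregate east-west bandwidth
    # for a training cluster. Spine uplinks and oversubscription are ignored;
    # all input values are illustrative assumptions.

    def fabric_estimate(num_gpus: int, nics_per_gpu: int, link_gbit_s: int):
        """Return (GPU-facing ports, aggregate bandwidth in Tbit/s)."""
        ports = num_gpus * nics_per_gpu
        aggregate_tbit_s = ports * link_gbit_s / 1000.0
        return ports, aggregate_tbit_s

    # Example: 1,024 GPUs, one 400 Gbit/s NIC per GPU.
    ports, tbit_s = fabric_estimate(num_gpus=1024, nics_per_gpu=1, link_gbit_s=400)
    print(f"GPU-facing ports:    {ports}")              # 1024
    print(f"Aggregate bandwidth: {tbit_s:.0f} Tbit/s")  # ~410 Tbit/s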

Summary

In the fast-moving and innovative world of AI, the rapid integration of various complex technologies is crucial. Computacenter helps businesses keep pace with the latest technological advances and get the maximum performance out of their AI applications. Our Rapid Data Center Deployment Services offer an ideal solution for making this integration efficient and effective. By rapidly deploying data centre infrastructure, companies can integrate their AI applications seamlessly, increasing their innovation and competitiveness.

Ready to lead the AI revolution? Let's innovate together.

Thomas Munser, Solution Manager Data Center
