Tier 1 to 4: The four tier classes of the data center

The Uptime Institute, based in the USA, introduced the tier topology as a global standard for classifying data centers at the end of the 1990s. Each "tier" stands for a certain level of infrastructure performance that the respective data center and its subsystems fulfill. It is the most commonly used, and sometimes misused, "standard" for describing the structure and availability of a data center.

The tier topology provides for a total of four tiers (Tier 1 through Tier 4), with Tier 1 being the least reliable environment and Tier 4 being classified as "highly available."

What does "high availability" mean?

"The term "availability" refers to the probability that a system can actually be used as designed at a given time.
(Guide "Operationally Secure Data Centers, Bitcom, December 2013)

Availability is measured as the ratio of uptime to total operating time:

Availability = Uptime / (Uptime + Downtime)

or, equivalently,

Availability = 1 - Downtime / (Uptime + Downtime)

"High availability (HA for short) refers to the ability of a system to ensure unrestricted operation in the event of failure of one of its components."
(Andrea Held, Oracle 10g High Availability)

For "high availability", the probability of a system being available must be above 99.99%. The annual downtime must therefore be in the range of minutes.

Four-tier topology for data center classification

[Figure: Tier topology of the Uptime Institute (Tier 1 to Tier 4)]

Tier I

The Tier 1 standard describes a basic infrastructure without any redundancy, suitable only for non-critical workloads. Its defining feature is that every component for supplying and distributing the load exists only once. Maintenance is therefore only possible on a planned basis and requires a complete shutdown of the systems. In addition, the risk of failure is high, because faults from technical components or human error can occur at every connection and supply level.

Tier II

In a Tier 2 data center, redundancy is provided by redundant power and cooling components ("N+1" redundancy), such as an additional UPS module, extra cooling units, pumps or generators. However, there is still only a single distribution path for all loads, so availability increases but remains significantly weaker than at the following Tier 3 level. In this setup, which may be intended as a transitional solution, both planned maintenance and unplanned events can still severely affect the running systems or lead to complete failure.

Tier III

Tier 3 is the most widely used standard. The significant difference from Tier 2 is that a Tier 3 data center is concurrently maintainable: maintenance of individual components can be planned and carried out during operation without interruption. For this purpose, both the supply and the distribution of the load each have "N+1" redundancy (see also our blog article "How does redundant power work in the data center?"). The maximum downtime (averaged over 5 years) is thus reduced to 1.6 hours per year, which corresponds to an availability of about 99.98%.

Tier IV

Tier 4, the current maximum level, adds fault-tolerant infrastructure elements. In this setup, each system element has its own automatic reaction and defense mechanisms to counteract any error or failure of individual components. All capacity and supply elements are designed to be fully redundant and are technically coordinated with each other. Complementary systems and supply paths are physically isolated from each other to avoid "contagion risks" in the event of failures. With an average availability of 99.995% per year, a Tier 4 data center is considered "highly available", but its design is very complex and correspondingly expensive to implement.

Fault tolerant systems

Fault-tolerant systems achieve particularly high availability because, with the help of intelligent software, they can react to almost all conceivable causes of error. In addition, their design eliminates single points of failure (SPOF). An SPOF is a single component on which the correct and reliable functioning of the overall system depends. This also applies to the design of the network and the storage technology: a single failed network switch, for example, can render the entire network unavailable.

By providing redundancy and automatic load balancing, SPOF risks can be contained. For this purpose, individual hardware and network components such as routers and switches are deployed multiple times in identical form. In the event of a failure, a redundant component can take over the task of the failed one. For particularly high availability requirements, the entire computer hardware can also be mirrored in the form of a standby system.
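
The effect of such redundancy can be estimated with a simple probability model. The following sketch (our illustration, not from the article; it assumes the redundant components fail independently, which real systems rarely do) compares a single component with a duplicated one:

```python
# Availability of redundant components, assuming independent failures.
# Illustrative model only: real systems share failure modes (power,
# software bugs, human error), so independence is an optimistic assumption.

def redundant_availability(component_availability: float, copies: int) -> float:
    """The group is up as long as at least one of `copies` components is up."""
    return 1 - (1 - component_availability) ** copies

single = 0.999  # hypothetical switch with 99.9% availability (an SPOF)
print(f"1 switch:   {redundant_availability(single, 1):.6f}")  # 0.999000
print(f"2 switches: {redundant_availability(single, 2):.6f}")  # 0.999999
```

In this model, duplicating the switch cuts the expected downtime from about 8.8 hours to under a minute per year, which is why eliminating SPOFs has such a large effect on availability.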

It should be noted, however, that high availability is not determined at the physical infrastructure level alone. Organizational and operational structures are no less crucial for the secure operation of the infrastructure. These include, for example:

  • Trained service personnel
  • Keeping spare parts in stock
  • Concluding maintenance contracts
  • Instructions for behavior in the event of a fault or emergency
  • Fast, accurate communication
  • Traceable logging of events

The most important standards that can be certified at the organizational level are ISO/IEC 27001 (the standard for information security management systems, ISMS for short), with reference to IT-Grundschutz, and ISO/IEC 20000 (the standard for IT service management, ITSM for short). For each of these standards there is also a companion guide with best-practice instructions: ISO/IEC 27002 and ITIL (IT Infrastructure Library), respectively.

Recommended links:

Wolfgang Heinhaus, Ulrike Ostler (2017, June 30). What secures data centers from Tier I to IV? URL: https://www.datacenter-insider.de/was-sichert-rechenzentren-von-tier-i-bis-iv-a-619349/
Andrea Held (2015, October 6). High Availability and Downtime: An Introduction. URL: https://www.informatik-aktuell.de/betrieb/verfuegbarkeit/hochverfuegbarkeit-und-downtime-eine-einfuehrung.html
Holger Skurk (2013, December 13). Operationally Secure Data Center. URL: https://www.bitkom.org/Bitkom/Publikationen/Betriebssicheres-Rechenzentrum.html
Uptime Institute (2018, January). Tier Standard: Topology. URL: https://uptimeinstitute.com/uptime_assets/d60e4b92ffa912e586a68d76617c4f12c3700681477064382a03207927a1ee96-00001A.pdf
