
Outages ITOps professionals are thankful to avoid

Nastel Technologies®
November 29, 2022

As we settle into the time of year when we reflect on what we’re thankful for, we tend to focus on important basics such as health, family and friends. But on a professional level, IT operations (ITOps) practitioners are thankful to avoid disastrous outages that can cause confusion, frustration, lost revenue and damaged reputations. The very last thing ITOps, network operations center (NOC) or site reliability engineering (SRE) teams want while eating their turkey and enjoying time with family is to get paged about an outage. Outages can be extremely costly: $12,913 per minute, in fact, and up to $1.5 million per hour for larger organizations.

 

To understand the peace of mind that comes with avoiding downtime, however, you have to have endured the pain and anxiety that come with outages first-hand. Here are a handful of the horror stories ITOps pros are thankful to avoid this season.

 

A case of janky command structure

 

One longtime IT pro was on a shift with three others as 7 p.m. rolled around. The crew received an alert about a problem impacting the front-end user interface for its global traffic manager device. Thankfully, there was a runbook for it housed in a database, so it appeared the problem would be resolved quickly. One of the team members saw two things to type in: a command and a secondary input. He typed in the command and, based on the way the runbook looked, waited for the command line to ask for an input, such as “what do you want to restart?”

 

The way the command structure was set up, if you didn’t provide an input, the device itself would restart. He typed in what he thought was the correct command — “bigstart, restart” — and the entire front-end global traffic manager was taken down.

 

Just as a reminder, this took place in the early evening. The customer was a finance company, and the system went down just around the time when businesses were closing and trying to do their books and other finance-related tasks. Terrible timing, to say the least.

Five minutes into the outage, the ITOps team realized what had happened: the tool they used for their runbook wrapped text by default, so what looked like two separate commands was actually just one. Even though the outage was relatively short, it came at a critical time and created a chain reaction of headaches. The lesson learned? Ensure your command structure is optimized, so that a missing input cannot silently trigger a full restart, and check how your runbook tool displays long commands.
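One way to picture a safeguard against this failure mode is a small pre-flight check that rejoins wrapped runbook lines and refuses any restart step that lacks an explicit target. The sketch below is illustrative only: the function name `parse_restart_step` and the target `example_service` are hypothetical, the `<tool> restart <target>` shape is assumed from the story above, and the code only builds an argument list without executing anything.

```python
import shlex


def parse_restart_step(runbook_text: str) -> list[str]:
    """Join wrapped runbook lines into one command and require an explicit target.

    Hypothetical guard for illustration; not part of any vendor tooling.
    """
    # Treat visually wrapped lines as a single command, as the runbook intended.
    joined = " ".join(
        line.strip() for line in runbook_text.splitlines() if line.strip()
    )
    args = shlex.split(joined)
    # Assumed shape: <tool> restart <target>. Anything shorter would fall back
    # to the tool's default behavior, which in the story restarted everything.
    if len(args) < 3 or args[1] != "restart":
        raise ValueError(f"refusing to run {joined!r}: no explicit restart target")
    return args


# A wrapped two-line runbook entry is accepted as one complete command.
print(parse_restart_step("bigstart restart\n    example_service"))
```

Passing only the first line, as the operator effectively did, raises a `ValueError` instead of producing a runnable command.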

 

 

This article originally appeared on codeopinion.com. To read the full article, click here.

Nastel Technologies is the global leader in Integration Infrastructure Management (i2M). It helps companies achieve flawless delivery of digital services powered by integration infrastructure by delivering tools for Middleware Management, Monitoring, Tracking, and Analytics. These tools detect anomalies, accelerate decisions, and enable customers to constantly innovate, answer business-centric questions, and provide actionable guidance for decision-makers. It is particularly focused on IBM MQ, Apache Kafka, Solace, TIBCO EMS, ACE/IIB and also supports RabbitMQ, ActiveMQ, Blockchain, IoT, DataPower, MFT, IBM Cloud Pak for Integration and many more.

 

