Just like the third rail that carries 600 volts of lethal direct current along the subway tracks, there is a third rail in IT (Information Technology): a subject that, if touched, creates nothing but pain for everyone involved.
But just like that third rail on the train tracks, there are times when it must be discussed, however dangerous that may be.
The third rail in IT is simply that the complexity of IT systems exceeds the ability of the monitoring systems to effectively provide useful information in a timely manner.
That doesn’t sound so bad, but it accounts for a huge part of IT spend and people’s time, and anyone who raises the idea within their organization will quickly be marked as a troublemaker.
Most IT shops have spent many man-years building their monitoring solutions, and yes, I use the plural, because it’s very normal to have disparate systems to monitor each part of IT. This leads to the true complexity and cost concern: if each of your teams measures only the performance of its own subsystem, and no one measures performance in totality, then the first person to know about a performance problem is often the customer. And that never ends well.
Customers either suck it up and accept poor performance or they report it to support or their account manager. If they accept poor performance, then they may end up being someone else’s customer sooner than you would like.
If they choose to report the issue, then it becomes a problem for your whole team, who have to become “crime scene investigators” and dig through the entire user experience to discover why there is an issue, resolve it, and ensure it doesn’t happen again. This will involve representatives of each department across your entire IT ecosystem collecting data from their respective monitoring and reporting systems and presenting it at a meeting to investigate the situation. These meetings are often called war rooms, but I spoke recently with a general manager of a large enterprise who had a more apt name. She called them “goat fests”, because the combined sound of lots of experts talking over each other to explain why they were not the cause of the issue was reminiscent of a herd of goats bleating.
War room meetings are possibly the most painful meetings you will ever attend. They can take many hours or even days and involve an inordinate number of people. And yet they represent the best process that existing performance monitoring can deliver.
But there is something still worse than a war room meeting, and that is the meetings that follow to ensure that the problem, once identified and resolved, doesn’t happen again. Some industries use the term CAPA for these meetings. CAPA stands for Corrective and Preventive Action, and these meetings are critical to improving processes, but also to diffusing responsibility. No one wants to be responsible for the problem, and so writing a thick, technical report can allow the responsibility to be spread around in ways that ensure no one has to lose their job. It may sound quite political, and in many situations that’s exactly what it is.
Solving War Rooms
What war rooms and CAPA meetings really accomplish is to explain why, after spending many millions on systems to monitor a business application’s performance, and many more millions on experts to run these systems, the core goals of all this continuing investment have not been achieved.
For many companies, the goals given to Operations are to ensure that problems occur less frequently over time, and that any problem that does occur is resolved faster than ever before. The most common measures of these goals are MTTR and MTBF (Mean Time to Repair and Mean Time Between Failures).
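As a rough illustration of how these two measures can be derived from an incident log, here is a minimal sketch. The incident data and function names are hypothetical, and MTBF is taken here to mean the average uptime between one repair and the next failure; some organizations measure it from failure start to failure start instead.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (failure detected, service restored) pairs.
incidents = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 11, 0)),
    (datetime(2024, 1, 11, 9, 0), datetime(2024, 1, 11, 10, 0)),
    (datetime(2024, 1, 21, 9, 0), datetime(2024, 1, 21, 12, 0)),
]

def mean_time_to_repair(incidents):
    """Average time from failure to restoration."""
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)

def mean_time_between_failures(incidents):
    """Average uptime between one restoration and the next failure (assumed definition)."""
    ordered = sorted(incidents)
    gaps = [
        next_start - prev_end
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    return sum(gaps, timedelta()) / len(gaps)

print("MTTR:", mean_time_to_repair(incidents))
print("MTBF:", mean_time_between_failures(incidents))
```

Operations is doing well against its goals when MTTR trends down and MTBF trends up across reporting periods.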
Today’s IT environments use a mix of legacy technology (including mainframes and Unix systems), which has to work with on-premises virtualized environments as well as private, public, and hybrid cloud environments, and also interface with leading-edge ideas such as blockchain and IoT actuators and sensors. These environments take the minds of many thousands of experts to implement, so it’s not surprising that when problems occur, they are hard to solve.
But being hard to solve is not an excuse, it never has been, and it never will be.
Hard problems need innovation to overcome.
And that brings me back around to the third rail.
Imagine if there were a simple way to sit on the shoulder of a user’s request and follow it through the whole of your IT environment, taking note of every system that it touches and how much effort each system takes to process the user’s need. This would work even if the request required multiple systems to handle different elements in parallel, with all the elements eventually coalesced back into a set of responses that could be viewed as the user experienced them.
Imagine a simple way of analytically visualizing the transaction that a user made, with all the relevant performance data from the entire environment overlaid on the transaction. Now you would be able to quickly review exactly where any issue took place, even if the issue was spread across many systems.
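To make the idea concrete, here is a minimal sketch of stitching per-system timing events into one view of a transaction. It is an illustration only, not a description of any product’s implementation: the event fields, system names, and the shared `txn` identifier are all assumptions, and in practice that identifier may have to be derived from the contents of the messages themselves.

```python
from collections import defaultdict

# Hypothetical timing events collected from separate monitoring tools.
# Each event is assumed to carry a shared transaction ID ("txn").
events = [
    {"txn": "order-42", "system": "web", "start_ms": 0, "end_ms": 40},
    {"txn": "order-42", "system": "mq", "start_ms": 40, "end_ms": 55},
    {"txn": "order-42", "system": "mainframe", "start_ms": 55, "end_ms": 900},
    {"txn": "order-42", "system": "cloud-db", "start_ms": 55, "end_ms": 120},
    {"txn": "order-7", "system": "web", "start_ms": 0, "end_ms": 30},
]

def build_traces(events):
    """Group events by transaction ID and order each group by start time."""
    traces = defaultdict(list)
    for event in events:
        traces[event["txn"]].append(event)
    for hops in traces.values():
        hops.sort(key=lambda e: e["start_ms"])
    return dict(traces)

def end_to_end_ms(hops):
    """Elapsed time as the user experienced it, parallel branches included."""
    return max(e["end_ms"] for e in hops) - min(e["start_ms"] for e in hops)

def slowest_hop(hops):
    """The single system that spent the longest on this transaction."""
    return max(hops, key=lambda e: e["end_ms"] - e["start_ms"])

traces = build_traces(events)
for txn, hops in traces.items():
    worst = slowest_hop(hops)
    print(txn, end_to_end_ms(hops), "ms; slowest system:", worst["system"])
```

With the events joined this way, the question “where did the time go?” is answered from one data set rather than from each team’s separate report.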
All of a sudden, the idea of the third rail wouldn’t be so scary. You would be able to spot problems, even complex ones, easily, avoiding war rooms and CAPA meetings. You could even use machine learning to compare the progress of each transaction against historical data and predict the likelihood of an issue, allowing predictive alerts to be raised and action to be taken before a user ever sees a performance issue. And you could do all of this without drama, and without needing to rip and replace existing processes, using what already exists and supplementing it in ways that the business can manage.
This is not just possible, it’s in production with many of the most innovative companies in the world, and it’s based on the products of just one company: Nastel Technologies.
Today Nastel Technologies is the only company that exploits the contents of messaging middleware messages to abstract business understanding from them and use this to monitor and analyze application performance.
Nastel understands that there is a deeply political undercurrent that pervades the IT and commercial arms of every enterprise, a siloed mentality that bleeds efficiency and effectiveness and causes massive underperformance and overspend. We address this issue head-on and provide a viable solution to reduce costs, improve performance, and reinvent large parts of the IT Operations process.
When we get in front of decision makers, they have an epiphany that changes their outlook to such an extent that we win deals against the largest, most entrenched competitors continually.
Is Nastel Technologies the answer to your specific third rail issue? Ask us, we’d love to have that chat.
Nastel Technologies is the global leader in Integration Infrastructure Management (i2M). It helps companies achieve flawless delivery of digital services powered by integration infrastructure by delivering tools for Middleware Management, Monitoring, Tracking, and Analytics to detect anomalies, accelerate decisions, and enable customers to constantly innovate, to answer business-centric questions, and provide actionable guidance for decision-makers. It is particularly focused on IBM MQ, Apache Kafka, Solace, TIBCO EMS, ACE/IIB and also supports RabbitMQ, ActiveMQ, Blockchain, IOT, DataPower, MFT, IBM Cloud Pak for Integration and many more.
The Nastel i2M Platform provides:
- Secure self-service configuration management with auditing for governance & compliance
- Message management for Application Development, Test, & Support
- Real-time performance monitoring, alerting, and remediation
- Business transaction tracking and IT message tracing
- AIOps and APM
- Automation for CI/CD DevOps
- Analytics for root cause analysis & Management Information (MI)
- Integration with ITSM/SIEM solutions including ServiceNow, Splunk, & AppDynamics