5 Critical Metrics When Deciding What To Automate In AIOps
What are the best ways to apply AIOps in your IT environment? Here are five key metrics to consider.
We automate for three benefits: to improve responsiveness, remove drudgery, and deliver consistent results. But automation has consequences, too. As you automate, you’re potentially creating technical debt. The automated procedure must be kept up to date whenever you update the systems it automates. If it touches, say, the network and you change your networking vendor, you’ll have to update the automation and the scripts around it. That’s why it’s important to assess what you need (and don’t need) to automate.
You may wish you could build an all-encompassing automation platform. However, automated reactions to production anomalies can involve major resolution tasks, such as rebuilding or recovering a database, and those are not steps to automate lightly. Based on my consulting work, I’ve developed five criteria that I use when working with clients to help them decide what to automate in their IT environments.
Five Criteria for Assessing What to Automate in AIOps
Will it take longer to implement the automation than to respond manually to events?
The proverb about the straw that broke the camel’s back applies to many IT anomalies. A first step in an automation assessment is to identify how often the triggering event or anomaly has occurred, or may occur. There’s no point in automating the reaction to a one-off event. On the other hand, even if this is the first time the anomaly has reached a crisis point, it may have come close before.
When an issue finally comes to your attention—when something breaks—it’s often just the final straw in a series of events, like when a system overloads after coming close many times in prior weeks or months. A query language built into your performance monitor is a powerful support feature, as it allows you to quickly search for times when you came close to an anomaly in the past. Once you know what metrics lead up to the anomaly, you can query to find out how often the event occurs.
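As a rough illustration of this first criterion, the sketch below counts past near misses in a metric history and then compares the effort of building and maintaining an automation against the effort of responding by hand. It is a minimal sketch, not a feature of any particular monitoring product: the `Sample` type, the function names, the 90 percent threshold, and all hour figures are assumptions made up for the example. In practice the samples would come from your performance monitor’s query language.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    timestamp: datetime
    value: float  # hypothetical metric, e.g. CPU utilisation in percent


def count_near_misses(samples, near_threshold):
    """Count distinct episodes where the metric reached near_threshold.

    Consecutive samples at or above the threshold count as one episode.
    """
    episodes = 0
    in_episode = False
    for sample in sorted(samples, key=lambda s: s.timestamp):
        above = sample.value >= near_threshold
        if above and not in_episode:
            episodes += 1
        in_episode = above
    return episodes


def automation_pays_off(events_per_year, manual_hours_per_event,
                        build_hours, upkeep_hours_per_year,
                        horizon_years=1.0):
    """Return True if automating costs fewer hours than manual response.

    All inputs are estimates; upkeep_hours_per_year stands in for the
    technical debt of keeping the automation current.
    """
    manual_cost = events_per_year * manual_hours_per_event * horizon_years
    automation_cost = build_hours + upkeep_hours_per_year * horizon_years
    return automation_cost < manual_cost


# Illustrative data only: one reading per day, near-miss threshold of 90.
start = datetime(2024, 1, 1)
readings = [60, 92, 95, 70, 91, 65, 97, 99, 50]
history = [Sample(start + timedelta(days=i), v) for i, v in enumerate(readings)]
near_misses = count_near_misses(history, near_threshold=90)
```

With these made-up readings, the history contains three separate near-miss episodes. Feeding an annualized event count into `automation_pays_off` then gives a first, coarse answer to whether the automation is worth building at all.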
Are you automating the solution to a major issue?
If the anomaly has an insignificant impact on your overall enterprise, incurring the technical debt of an automated response isn’t the answer. And if the problem is just a temporary slowdown and the response you would automate has high risk, then automation isn’t a go either.
So ask yourself: What’s the cost to the business?
Conversely, if you’re dealing with a dinosaur-extinction type of impact, one that could cost the business millions of dollars in lost sales, you’ll want an automated response so that your customers never take the hit. Ideally, the anomaly is fixed before your customers are even aware of it. That’s where tracking business transactions helps: it lets you tie an anomaly to the business impact it causes.
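One way to make the cost question concrete is to weigh the expected loss an automated response would avoid against the expected cost of the automation itself, including the risk that the automated action goes wrong. The sketch below is a simplified model under stated assumptions: the function name, its parameters, and every dollar figure are hypothetical, and real estimates would come from your own business transaction data.

```python
def should_automate(incidents_per_year, cost_per_incident,
                    failure_probability, cost_if_automation_fails,
                    upkeep_cost_per_year):
    """Return True when the expected yearly loss avoided exceeds the
    expected yearly cost of running the automation.

    Simplifying assumption: a successful automated response avoids the
    full cost of the incident.
    """
    loss_avoided = incidents_per_year * cost_per_incident
    risk_cost = incidents_per_year * failure_probability * cost_if_automation_fails
    return loss_avoided > risk_cost + upkeep_cost_per_year


# Temporary slowdown, risky response (all figures are made up).
slowdown = should_automate(
    incidents_per_year=20, cost_per_incident=500,
    failure_probability=0.05, cost_if_automation_fails=50_000,
    upkeep_cost_per_year=5_000,
)

# Rare outage that costs millions in lost sales (all figures are made up).
outage = should_automate(
    incidents_per_year=2, cost_per_incident=1_000_000,
    failure_probability=0.05, cost_if_automation_fails=50_000,
    upkeep_cost_per_year=5_000,
)
```

Under these illustrative numbers the slowdown case comes out against automating, because the risk and upkeep outweigh the small loss avoided, while the outage case comes out in favor.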
This article originally appeared on forbes.com.
Nastel Technologies uses machine learning to detect anomalies, behavior, and sentiment, so that organizations can accelerate decisions, satisfy customers, and innovate continuously. To answer business-centric questions and provide actionable guidance for decision-makers, Nastel’s AutoPilot® for Analytics fuses:
- Advanced predictive anomaly detection, Bayesian Classification and other machine learning algorithms
- Raw information handling and analytics speed
- End-to-end business transaction tracking that spans technologies, tiers, and organizations
- Intuitive, easy-to-use data visualizations and dashboards