Let’s use DevOps thinking to make machine learning fit more easily into your IT environment
How Machine Learning Ops (MLOps) drives business use of ML models.
That useful DevOps-style approach is MLOps, a proven method for improving collaboration and communication between data scientists and IT professionals to better manage the ML production lifecycle. The clue’s in the name: just as DevOps has cemented Agile and rapid app and software development based on what the business wants, MLOps could do the same by bridging the gap between model complexity and production.

The heart of the problem is that as ML gets more mainstream, we’re creating more and better models (the outputs of applying ML to your data), but they’re not getting to the finish line in the business yet; Gartner’s Erick Brethenoux estimated last year that less than half (47 percent) go into production. That’s not satisfactory. As Machine Learning has started to gain traction, the speed at which data scientists (and, increasingly, citizen data scientists) can build models through AutoML technology has increased, but if the models don’t get used, ML isn’t adding value.
What’s the hold-up? Time was the bottleneck, as was getting hold of data and storing it for ML use, but Big Data and systems like Hadoop solved that for us. That moved the issue to discovering patterns in that data, and we’ve started to solve that problem by building more and more Machine Learning models through AutoML. But now we’ve just moved the latency to the point where the business gains value from these models by putting them into production. And even when our models do get deployed, the time it takes an organization to move one into production can be weeks or even months.
Our analysis suggests that the problem lies in how models are operationalized. The data scientists tend to build or create the models, and a production IT team then looks to deploy and manage them. However, the two teams aren’t working together as seamlessly as they could, because:
First, they have completely different mindsets. Data scientists tend to be creative and experimental. They work in an R&D culture, so they don’t like process, they don’t follow a set structure, and they’re focused on building the best model possible. IT plays by different rules: they have responsibility for production systems and need to make things work; they want the systems they build to be robust, available 24/7, and ideally conformant to standards and processes. When a model goes into production, the data scientist has created it but IT has to manage it, and the two aren’t always as aligned with each other’s approach as they should be.
Secondly, each team has different competencies. Your data science team is focused on the tools needed to build a great, accurate model; they don’t care as much about production environments and code requirements. IT understands production environments and what software can be used in production, but not the intricate details of how an ML model gets built: what a Machine Learning algorithm actually is, or the languages models are built in, such as Python or R.
Lack of problem ownership
Thirdly, data science teams are under increasing demand from different parts of the organization to solve more and more business problems. People are a significant constraint, and it’s not efficient for data scientists to spend their time understanding IT systems when you already have an IT team doing that. Equally, it doesn’t make sense for your IT team to learn the nuts and bolts of a Machine Learning model. What we want to avoid is any kind of “let’s build a model and throw it over the wall to IT, and they’ll just sort it out” thinking.
Finally, you have a lack of ownership of the problem. When the data science team builds a credit risk model for the risk team, the risk team is the customer, the data science team builds the model, and the IT team deploys it, so responsibility for solving the problem is split across three teams. There’s also the area of governance: when you put a model into production and it’s making decisions every second, every minute, every hour about a business process, it has become mission-critical, so you need to ensure this asset is fully governed and that only certain people have access to those production models. Only selected people should see it and understand how it works, and as you add and retire models, you need to make sure you’re tracking the changes that are going on. Who made the change? When? How often is that change happening? Tracing that process matters for compliance, but also for troubleshooting.
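The audit trail described above (who changed which model, and when) can be sketched as a minimal, hypothetical model registry that records every change. The class, field, and model names here are illustrative assumptions for this article, not part of any specific MLOps product:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AuditEvent:
    """One governance record: who changed which model, when, and how."""
    model_name: str
    action: str        # e.g. "deploy", "update", "retire"
    actor: str         # who made the change
    timestamp: datetime


class ModelRegistry:
    """Hypothetical registry that logs every production model change."""

    def __init__(self):
        self._log: list[AuditEvent] = []

    def record(self, model_name: str, action: str, actor: str) -> None:
        # Append an immutable audit record with a UTC timestamp.
        self._log.append(
            AuditEvent(model_name, action, actor, datetime.now(timezone.utc))
        )

    def history(self, model_name: str) -> list[AuditEvent]:
        """Trace all changes to one model, for compliance or troubleshooting."""
        return [e for e in self._log if e.model_name == model_name]


registry = ModelRegistry()
registry.record("credit_risk_v2", "deploy", "alice")
registry.record("credit_risk_v2", "update", "bob")
print(len(registry.history("credit_risk_v2")))  # two recorded changes
```

In practice a registry like this would also enforce who may record changes at all, which is the access-control half of the governance story.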
Against this set of challenges, MLOps is the best way to scale and govern Machine Learning activity. It allows data science teams and IT to collaborate, and it enables your IT operations team (now, really, your model ops team) to centrally manage the everyday operations needed to keep models healthy, keep them running, and ensure they’re performing.
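An everyday "keep the model healthy" check can be as simple as comparing a model's live accuracy against its validated baseline and flagging drift. This is a minimal sketch with made-up thresholds and function names, not a reference to any particular monitoring tool:

```python
def is_model_healthy(live_accuracy: float,
                     baseline_accuracy: float,
                     max_drop: float = 0.05) -> bool:
    """Flag a model as unhealthy if its live accuracy has drifted
    more than `max_drop` below the baseline it was validated at."""
    return (baseline_accuracy - live_accuracy) <= max_drop


# Routine check the model ops team can run on a schedule; only a
# severe failure (an unhealthy model) is escalated back to the
# data science team, per the division of labor described above.
if not is_model_healthy(live_accuracy=0.81, baseline_accuracy=0.88):
    print("escalate to data science team")
```

The design point is the escalation boundary: ordinary checks stay with ops, and data scientists are pulled in only when the threshold is breached.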
With this approach, you would only really let the data science team step in when there’s a severe problem with the model, allowing them to focus on building more and more models. You get your new ‘model ops’ team to quickly deploy and manage models, scaling the whole process.
MLOps makes sense because the operational processes are very similar even for the most sophisticated and complex Machine Learning model: testing it to make sure it performs in a specific way, then deploying it into production. These are all disciplines IT teams are used to and proficient in.
This article originally appeared on itproportal.com.
Nastel Technologies helps companies achieve flawless delivery of digital services powered by middleware. Nastel delivers Middleware Management, Monitoring, Tracking and Analytics to detect anomalies, accelerate decisions, and enable customers to constantly innovate. To answer business-centric questions and provide actionable guidance for decision-makers, Nastel’s Navigator X fuses:
- Advanced predictive anomaly detection, Bayesian Classification and other machine learning algorithms
- Raw information handling and analytics speed
- End-to-end business transaction tracking that spans technologies, tiers, and organizations
- Intuitive, easy-to-use data visualizations and dashboards
Nastel Technologies is the global leader in Integration Infrastructure Management (i2M), helping companies achieve flawless delivery of digital services powered by integration infrastructure. It is particularly focused on IBM MQ, Apache Kafka, Solace, TIBCO EMS, ACE/IIB and also supports RabbitMQ, ActiveMQ, Blockchain, IoT, DataPower, MFT, IBM Cloud Pak for Integration and many more.
The Nastel i2M Platform provides:
- Secure self-service configuration management with auditing for governance & compliance
- Message management for Application Development, Test, & Support
- Real-time performance monitoring, alerting, and remediation
- Business transaction tracking and IT message tracing
- AIOps and APM
- Automation for CI/CD DevOps
- Analytics for root cause analysis & Management Information (MI)
- Integration with ITSM/SIEM solutions including ServiceNow, Splunk, & AppDynamics