Why ML Testing Could Be The Future of Data Science Careers
This article discusses testing as a distinct career option in data science and machine learning (ML). It briefly outlines testing workflows and processes, and describes the expertise and top-level skills a tester needs in order to test an ML application.
Testing in Data Science: Opportunity for Expansion
There is a significant opportunity to explore and expand the possibilities of testing and quality assurance into the field of data science and machine learning (ML).
Working with training data, algorithms and models in data science is a complex yet interesting activity, and testing these applications is no less so.
A considerable amount of time goes into testing and quality assurance activities. Experts and researchers estimate that 20 to 30% of overall development time is spent testing the application, and that 40 to 50% of a project's total cost goes to testing.
Moreover, data science experts and practitioners often complain about having ready-for-production data science models, established evaluation criteria and set templates for report generation, but no teams to help them test these models. This opens up testing in data science as a full-fledged career option.
Testing in data science calls for an entirely new context and approach. For such systems, however, this new backdrop consumes even more time, effort and money than legacy systems do.
To understand this complexity, we first need to understand the mechanics behind machine learning systems.
How Machine Learning Systems Work
In machine learning (ML), humans feed in the desired behavior as examples (the training data set) during the training phase, and the model optimization process produces the system's rationale (or logic).
But what is lacking is a mechanism to find out whether this optimized rationale will produce the desired behavior consistently.
This is where testing comes in.
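To make this concrete, here is a minimal sketch of what such a test might look like. The `predict` function below is a hypothetical stand-in for a trained model's rationale; in a real project it would be produced by the optimization process described above, not written by hand.

```python
def predict(text):
    # Hypothetical stand-in for a trained sentiment model's logic.
    positive = {"good", "great", "excellent"}
    negative = {"bad", "poor", "terrible"}
    words = set(text.lower().split())
    score = len(words & positive) - len(words & negative)
    return "positive" if score >= 0 else "negative"

def test_desired_behaviour():
    # Encode the desired behavior as explicit expectations,
    # independent of any metrics computed during training.
    assert predict("a great product") == "positive"
    assert predict("a terrible product") == "negative"
    # Consistency: the same input must always yield the same output.
    assert predict("a great product") == predict("a great product")

test_desired_behaviour()
print("behaviour tests passed")
```

The point is that the desired behavior is written down as checkable expectations, separate from the training process that produced the model.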
Workflow for Testing Machine Learning Systems
Typically, in machine learning, once a model is trained, an evaluation report is automatically produced based on established criteria, which includes:
- The model's performance on the validation dataset, based on established metrics. Accuracy and the F1 score are common choices, although many others are used as well.
- An array of plots, such as precision-recall curves and AUC-ROC curves. This array is, again, not exhaustive.
- The hyperparameters used to train the model.
Based on the evaluation report, models that offer an improvement over the existing model (or baseline) on the same dataset are promoted and considered for final inclusion.
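The report-and-promote step above can be sketched in a few lines. Everything here is illustrative: the labels, predictions, hyperparameter values and the baseline figure are assumptions, and the metrics are computed in pure Python for clarity rather than with a library.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the validation labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    # Harmonic mean of precision and recall for the positive class.
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Illustrative validation labels and one candidate model's predictions.
y_val  = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

report = {
    "accuracy": accuracy(y_val, y_pred),
    "f1": f1_score(y_val, y_pred),
    "hyperparameters": {"learning_rate": 0.01, "max_depth": 4},  # assumed values
}

baseline_accuracy = 0.70  # assumed score of the existing (baseline) model
promote = report["accuracy"] > baseline_accuracy
print(report, promote)
```

In practice the report would also bundle the plots mentioned above; the sketch only shows the numeric part and the promotion decision.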
While reviewing multiple ML models, testers inspect the metrics and plots that summarize each model's performance over a validation dataset. Performance between models is compared to make relative judgments, but adequate model behavior cannot be characterized from this alone.
Let us take an example to understand this.
Case Study: A Hypothetical Data Science Project
Consider a project in which training data is used to develop models. The developed models are tested for performance over a validation dataset, and evaluation reports are generated using accuracy as the metric.
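Such a project's selection step might look like the sketch below. The validation labels and the three candidate models' predictions are made up for illustration; the point is only the mechanic of scoring every candidate on the same dataset and promoting the most accurate one.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative validation labels.
y_val = [1, 0, 1, 1, 0]

# Predictions from three hypothetical candidate models.
candidates = {
    "model_a": [1, 0, 1, 0, 0],
    "model_b": [1, 0, 1, 1, 0],
    "model_c": [0, 0, 1, 1, 1],
}

# One accuracy-based evaluation report per candidate, over the same dataset.
reports = {name: accuracy(y_val, preds) for name, preds in candidates.items()}
best = max(reports, key=reports.get)
print(best, reports[best])
```

A tester's job starts where this script ends: checking whether the promoted model's behavior, not just its accuracy, is adequate.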
This article originally appeared on Techopedia.