The Ground Truth Challenge
One of the key challenges for a data science team is finding an accurately labelled dataset for the problem at hand. While it is easy to build a basic model that is accurate enough for a demo to the business, going beyond that towards a production-worthy solution needs gold-standard ground truth data.
One of the easiest options available to a data science team is a crowdsourced annotation service. Most often, however, your Information Security team is not going to allow you to share the information on the Internet! Even if you manage to find a crowd service provider who can handle this, it becomes increasingly difficult to get good results from crowdsourced operations if the expectations on accuracy are very high or the task itself is relatively complex. This is a situation that demands ‘human touch’ as a fully managed service from an experienced data enablement company.
Do you wonder what this has to do with ML Ops? While the need for ‘human touch’ services is very clear for ground truthing in AI/ML projects, there is a need for it in other lifecycle stages of the ML pipeline as well!
After the model is trained with the pre-labelled data, it becomes imperative to validate and tune the model with new data that was not used during model development. Since this data is not labelled, a human has to validate the output of the model, especially for unstructured data such as images, audio or text. Human touch not only validates the performance of the model against an identified metric but also helps in labelling new data for further training and fine-tuning of models. After all, a defect identified in a deep learning model implementation does not need a “software defect fix”, but more data!
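The validation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `Review` record, the `summarize_reviews` helper and the choice of plain accuracy as the metric are all assumptions made for the example.

```python
from typing import List, NamedTuple, Tuple


class Review(NamedTuple):
    """One model output checked by a human reviewer (hypothetical record)."""
    item_id: str
    predicted: str    # label produced by the model
    human_label: str  # label confirmed or corrected by the reviewer


def summarize_reviews(reviews: List[Review]) -> Tuple[float, List[Tuple[str, str]]]:
    """Return accuracy on the reviewed items and the new labelled examples."""
    if not reviews:
        raise ValueError("at least one review is required")
    # The metric: share of items where the human agreed with the model.
    correct = sum(r.predicted == r.human_label for r in reviews)
    accuracy = correct / len(reviews)
    # Every reviewed item now has a trusted label and can be used for retraining.
    new_training_data = [(r.item_id, r.human_label) for r in reviews]
    return accuracy, new_training_data


reviews = [
    Review("img-1", "cat", "cat"),
    Review("img-2", "dog", "cat"),  # the reviewer corrected this prediction
    Review("img-3", "dog", "dog"),
    Review("img-4", "cat", "cat"),
]
accuracy, new_training_data = summarize_reviews(reviews)
print(f"accuracy on reviewed items: {accuracy:.2f}")
```

The same review pass yields both outputs: a performance number for the model and freshly labelled data, which is the "more data" a model defect calls for.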
Once it is fairly validated in the pre-production stage, a model is ready to be deployed in production. But as in any other project, the model now sees data from a relatively unknown domain, which may lead to errors in production, and these can prove expensive in a live environment! To ensure that the system keeps working even when a prediction by the model is inaccurate, human intervention to handle exceptions is most important in the production stage. When the prediction confidence is low, the model in production can escalate the event to a human for live validation and correction.
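A confidence-based escalation rule of this kind might look like the sketch below. The threshold value of 0.8, the `Decision` record and the `route_prediction` function are hypothetical names and settings chosen for illustration; a real deployment would tune the threshold against the cost of errors and the capacity of the human review team.

```python
from dataclasses import dataclass
from typing import Dict

# Assumed cut-off for illustration only; tune it per use case.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Decision:
    label: str         # most likely class according to the model
    confidence: float  # probability the model assigned to that class
    needs_human: bool  # True when the event should be escalated


def route_prediction(probabilities: Dict[str, float],
                     threshold: float = CONFIDENCE_THRESHOLD) -> Decision:
    """Pick the top class and flag the event for review if confidence is low."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return Decision(label, confidence, needs_human=confidence < threshold)


confident = route_prediction({"approve": 0.95, "reject": 0.05})
uncertain = route_prediction({"approve": 0.55, "reject": 0.45})
print(confident.needs_human, uncertain.needs_human)
```

Events flagged with `needs_human` go to a reviewer for live validation and correction, while confident predictions pass straight through.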
Machine Learning Operations
Valohai is a product that handles machine orchestration, version control and pipeline management for deep learning exceedingly well. The need for human touch services is inextricably linked with machine learning operations. The data enablement services from NextWealth complement the Valohai features very well, leading to a more comprehensive solution for a data science team!
Written by Kannan Sundar - Chief Digital Transformation Officer at NextWealth
Responsible for driving the Digital Roadmap for NextWealth, building capabilities in areas like RPA, AI/ML and workflow implementations. NextWealth is a Social Entrepreneurship venture with the objective of ‘social uplift through entrepreneurship’. NextWealth pioneered a Distributed Delivery Model involving the setting up of delivery centres in Tier 2-3 cities of India. NextWealth is in the business of providing Human Touch to Digital Processes.