Title: Machine Learning Techniques for Building a Large Scale Production Ready Classifier

Year of Publication: June 2015
Page Numbers: 1-16
Authors: Arthi Venkataraman
Conference Name: The Second International Conference on Data Mining, Internet Computing, and Big Data (BigData2015), Mauritius


This paper describes the techniques we followed to build a production-ready, scalable classifier system that classifies the tickets raised by employees of an organization. End users raise tickets in natural language, and the classifier then assigns each ticket to a class automatically. This is a practical, applied research paper in the area of machine learning. We applied active learning to improve prediction accuracy and used clustering to handle the data issues found in the training data. The core classifier combines the results of multiple machine learning algorithms using suitable scoring techniques. Use of this system has given more than 50% improvement in the ticket re-assignment index, and more than 80% accuracy has been achieved in correctly identifying the classes for the tickets. The system performs at scale, has response times well within expectations, and handles the peak load. Key takeaways from this paper include: how to build a live, production-ready classifier system; how to overcome the data-related challenges that arise while building such a system; the solution architecture for the classifier system; the deployment architecture for the classifier system; and how to prepare for the kinds of post-deployment challenges such a system can face. Benefits of building such a system include improved productivity, improved end-user experience, and quick turnaround time.
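The abstract does not say which algorithms or scoring techniques were combined, so the following is only a minimal sketch of one common way to do it: a weighted average of per-class scores from several models. The function name `combine_predictions`, the model names, the ticket classes, and the weights are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def combine_predictions(model_outputs, weights):
    """Combine per-class scores from several models by weighted average.

    model_outputs: dict mapping model name -> dict of class label -> score.
    weights: dict mapping model name -> non-negative weight (hypothetical
             values; in practice they might reflect validation accuracy).
    Returns the winning label and the combined score for every label.
    """
    total_weight = sum(weights[name] for name in model_outputs)
    combined = defaultdict(float)
    for name, scores in model_outputs.items():
        for label, score in scores.items():
            combined[label] += weights[name] * score / total_weight
    best_label = max(combined, key=combined.get)
    return best_label, dict(combined)


# Illustrative scores for one ticket from three hypothetical models.
outputs = {
    "naive_bayes": {"payroll": 0.6, "it_support": 0.4},
    "svm": {"payroll": 0.3, "it_support": 0.7},
    "logistic": {"payroll": 0.2, "it_support": 0.8},
}
weights = {"naive_bayes": 1.0, "svm": 2.0, "logistic": 1.0}

label, scores = combine_predictions(outputs, weights)
print(label, scores)
```

In this sketch a single model that disagrees with the others is outvoted unless its weight is large, which is one reason to combine several algorithms instead of relying on one.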