Machine Learning: From Practice to Production [REVIEW]

A review of Ramanan Balakrishnan's insights on the workflow process for AI-oriented products, exploring key questions to consider when launching Machine Learning projects in real production environments.

Gerardo Ortega

Ramanan Balakrishnan wrote an excellent post explaining the workflow for AI-oriented products, Machine Learning products in particular. He raises several questions worth revisiting whenever we take on an AI project that we intend to launch in a real production environment.

Garbage In, Garbage Out

Do I have a reliable data source? Where do I get my data?

Transforming Data into Inputs

What pre-processing steps are required? How do I normalize my data before using my algorithms?
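As one possible answer to the normalization question, here is a minimal sketch of two common rescaling schemes, min-max scaling and z-score standardization, written with only the Python standard library. The function names and the sample `ages` list are assumptions made for illustration; they do not come from the original post, and a real project would more likely rely on the scaler of its chosen framework.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical raw feature column.
ages = [20, 30, 40]
print(min_max_scale(ages))  # [0.0, 0.5, 1.0]
print(standardize(ages))
```

Whichever scheme is used, the scaling parameters (minimum and maximum, or mean and standard deviation) should be computed on the training data and reused unchanged at prediction time.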

Now, Shall We Begin?

What language or framework do I use? Python, R, Java, C++? Caffe, Torch, Theano, TensorFlow, DL4J?

Training the Models

How can I train my models? Should I buy GPUs outright or use custom hardware instances in the cloud with EC2? Can I parallelize processing to increase speed?

No System Is an Island

Do I need to make batch or real-time predictions? Embedded models or interfaces? RPC or REST?
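To make the batch versus real-time distinction concrete, the sketch below exposes one model in both ways: a batch function that scores many rows at once, and a request handler that scores a single JSON payload, as a REST route might. Everything here is hypothetical: `predict` is a stand-in with fixed, unlearned weights, and the `{"features": [...]}` payload shape is an assumption chosen for the example, not something the original post prescribes.

```python
import json


def predict(features):
    """Hypothetical stand-in model: a fixed linear score with a threshold."""
    weights = [0.5, -0.25]  # illustrative values, not learned from data
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def predict_batch(rows):
    """Batch mode: score a whole collection of rows in one pass."""
    return [predict(row) for row in rows]


def handle_json_request(body):
    """Real-time mode: score one JSON payload, returning (status, body).

    An HTTP framework's route handler could delegate to this function.
    """
    try:
        payload = json.loads(body)
        prediction = predict(payload["features"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing key, or non-numeric features.
        return 400, json.dumps({"error": "expected a JSON object with a 'features' list"})
    return 200, json.dumps({"prediction": prediction})


print(predict_batch([[2, 1], [0, 4]]))            # [1, 0]
print(handle_json_request('{"features": [2, 1]}'))
```

Keeping the model behind a plain function like this leaves the transport decision (RPC, REST, or a nightly batch job) open until the integration requirements are known.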

Performance Monitoring

How can I track my predictions? How can I log results to a database?
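One simple way to approach both questions is to write every prediction to a table and fill in the true outcome once it becomes known. The sketch below does this with SQLite from the Python standard library. The table layout, the column names, and the `model_version` field are assumptions made for this example; a production system would likely use a different store and schema.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) the prediction log; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS predictions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               logged_at TEXT NOT NULL,
               model_version TEXT NOT NULL,
               features TEXT NOT NULL,
               prediction REAL NOT NULL,
               actual REAL
           )"""
    )
    return conn


def log_prediction(conn, model_version, features, prediction):
    """Store one prediction and return its row id for later follow-up."""
    cur = conn.execute(
        "INSERT INTO predictions (logged_at, model_version, features, prediction) "
        "VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), model_version,
         json.dumps(features), prediction),
    )
    conn.commit()
    return cur.lastrowid


def record_actual(conn, row_id, actual):
    """Attach the observed outcome to a previously logged prediction."""
    conn.execute("UPDATE predictions SET actual = ? WHERE id = ?", (actual, row_id))
    conn.commit()


def accuracy(conn):
    """Fraction of resolved predictions that matched the outcome, or None."""
    row = conn.execute(
        "SELECT AVG(prediction = actual) FROM predictions WHERE actual IS NOT NULL"
    ).fetchone()
    return row[0]


conn = open_log()
first = log_prediction(conn, "v1", [2, 1], 1)
second = log_prediction(conn, "v1", [0, 4], 0)
record_actual(conn, first, 1)
record_actual(conn, second, 1)
print(accuracy(conn))  # 0.5
```

Logging the model version alongside each prediction makes it possible to compare accuracy across deployments later.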

The following image summarizes the process to follow.

[Image: Machine Learning from development to production]

About Gerardo Ortega

Software craftsman with a focus on scaling, polyglot programmer, coffee enthusiast, and lifelong learner. Passionate about machine learning, data science, and building great products.