Machine Learning: From Practice to Production [REVIEW]

Ramanan Balakrishnan wrote an excellent post explaining the workflow for Artificial Intelligence-oriented products, particularly those built on Machine Learning. He raises questions worth revisiting whenever we take on an AI-oriented project that is meant to launch in a real production environment.

Garbage In, Garbage Out

Do I have a reliable data source? Where do I get my data?

Transforming Data into Inputs

What pre-processing steps are required? How do I normalize my data before using my algorithms?
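
As one possible answer to the normalization question, here is a minimal sketch of z-score standardization using only the Python standard library. The function names and the sample values are made up for illustration and are not taken from the original post. The point it shows is that the statistics are computed once on the training data and then reused at prediction time.

```python
from statistics import mean, pstdev


def fit_standardizer(column):
    """Compute the mean and population standard deviation of a training column."""
    mu = mean(column)
    sigma = pstdev(column)
    # Guard against constant columns, which would otherwise divide by zero.
    if sigma == 0:
        sigma = 1.0
    return mu, sigma


def standardize(column, mu, sigma):
    """Rescale values using statistics computed on the training data."""
    return [(x - mu) / sigma for x in column]


# Illustrative training values (assumed, not real data).
train = [2.0, 4.0, 6.0, 8.0]
mu, sigma = fit_standardizer(train)
scaled_train = standardize(train, mu, sigma)

# At prediction time, reuse the training statistics instead of refitting.
scaled_new = standardize([5.0], mu, sigma)
```

Reusing `mu` and `sigma` keeps the inputs seen in production on the same scale as the inputs the model was trained on.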

Now, Shall We Begin?

What language or framework do I use? Python, R, Java, C++? Caffe, Torch, Theano, TensorFlow, DL4J?

Training the Models

How can I train my models? Should I buy GPUs outright or use custom hardware instances in the cloud with EC2? Can I parallelize processing to increase speed?

No System Is an Island

Do I need to make batch or real-time predictions? Embedded models or interfaces? RPC or REST?
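
To make the REST option concrete, here is a small sketch of a real-time prediction endpoint built on Python's standard `http.server` module. The `/predict` path, the `x1` and `x2` feature names, and the fixed linear `predict` function are all hypothetical placeholders standing in for a real trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    # Placeholder model (hypothetical): a fixed linear score.
    return 0.5 * features["x1"] + 0.25 * features["x2"]


def handle_predict(body):
    """Turn a JSON request body into (status_code, JSON response bytes)."""
    try:
        features = json.loads(body)
        result = {"prediction": predict(features)}
        status = 200
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing features, or non-numeric values.
        result = {"error": "expected a JSON object with numeric x1 and x2"}
        status = 400
    return status, json.dumps(result).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(host="127.0.0.1", port=8000):
    """Build the server without starting it; call serve_forever() to run."""
    return HTTPServer((host, port), PredictHandler)
```

Calling `make_server().serve_forever()` would start the service, after which a client can POST a JSON body such as `{"x1": 2, "x2": 4}` to `/predict`. Keeping `handle_predict` separate from the HTTP handler means the same function could also be called in a loop for batch predictions.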

Performance Monitoring

How can I track my predictions? How can I log results to a database?
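
One simple way to approach both questions is to write every prediction to a table together with its inputs and the model version. The sketch below uses SQLite from the Python standard library. The table layout, the column names, and the sample rows are assumptions chosen for illustration, not a scheme from the original post.

```python
import json
import sqlite3
import time


def open_prediction_log(path=":memory:"):
    """Open (or create) a SQLite database with a predictions table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "created_at REAL NOT NULL, "
        "model_version TEXT NOT NULL, "
        "features TEXT NOT NULL, "
        "prediction REAL NOT NULL)"
    )
    return conn


def log_prediction(conn, model_version, features, prediction):
    """Store one prediction with its inputs so it can be audited later."""
    conn.execute(
        "INSERT INTO predictions "
        "(created_at, model_version, features, prediction) "
        "VALUES (?, ?, ?, ?)",
        (time.time(), model_version, json.dumps(features), float(prediction)),
    )
    conn.commit()


# Illustrative usage with made-up feature values and scores.
conn = open_prediction_log()
log_prediction(conn, "v1", {"age": 42, "plan": "basic"}, 0.87)
log_prediction(conn, "v1", {"age": 23, "plan": "pro"}, 0.12)

# A basic monitoring query: volume and mean score per model version.
rows = conn.execute(
    "SELECT model_version, COUNT(*), AVG(prediction) "
    "FROM predictions GROUP BY model_version"
).fetchall()
```

Storing the model version next to each prediction makes it possible to compare versions later, and a shift in the average score over time can be an early hint that the input data has drifted.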

Here’s the image that summarizes the process to follow.

[Image: Machine Learning from development to production]