Use other services for machine learning

  • 4/11/2018

Chapter summary

In this chapter we set aside Azure Machine Learning to discuss alternatives for generating models and processing large amounts of data. Among everything reviewed, some key points stand out:

  • You have reviewed CNTK, a Microsoft deep learning library that lets you create and train deep models efficiently. The speedup comes mainly from parallelizing operations on GPUs. In addition, automatic differentiation lets you optimize models without deriving gradients by hand.

  • Throughout the chapter you have reviewed how to create virtual machines in Azure. The Deep Learning Virtual Machine, in addition to GPUs, has a number of pre-installed tools for data science and deep learning. The Data Science Virtual Machine focuses less on deep learning and has tools commonly used in the data science community.

  • You have seen how to use the Cortana Intelligence Gallery to clone solutions from others, which speeds up your machine learning development, and to share your own.

  • The main features of Azure HDInsight clusters have been listed, and you have seen which cluster type to create for the different analytics workloads that may arise in your business.

  • You have reviewed how to use a cluster with Spark SQL for exploratory data analysis, with MLlib to create machine learning models using Spark, with Mahout to train a recommender using MapReduce, and with R Server to build machine learning models.

  • In addition to R Server on a cluster, you have used R Server (Machine Learning Services) in a Data Science Virtual Machine. You configured additional options within your virtual machine and its SQL Server instance that allowed you to execute R (and Python) scripts remotely from a T-SQL query in a remote database client of your choosing.

  • With the R and T-SQL scripts executing correctly, you experimented with R data types, understood the mechanics behind the procedure call that runs R from T-SQL, and then trained and stored a linear model and used it to make predictions from data on the eruptions of the Old Faithful geyser in Yellowstone National Park.
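
As a reminder of what automatic differentiation (mentioned in the CNTK point above) buys you, here is a minimal sketch in plain Python. It is not CNTK code and not how CNTK is implemented; it uses forward-mode differentiation with dual numbers, and the names `Dual`, `derivative`, and `loss` are hypothetical, chosen only for this illustration. The point is that the gradient of `loss` is obtained without ever writing its derivative by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u * v' + u' * v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input with derivative 1."""
    return f(Dual(float(x), 1.0)).deriv


def loss(w):
    # Hypothetical squared error for a single made-up point:
    # prediction 3*w, target 6.
    error = 3.0 * w + (-6.0)
    return error * error


print(derivative(loss, 1.0))  # slope of the loss at w = 1
print(derivative(loss, 2.0))  # slope at the minimum, w = 2
```

An optimizer would use these slopes to decide how to adjust `w`. Deep learning libraries apply the same idea to models with millions of parameters, typically with a different differentiation strategy than the forward mode shown here.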