Machine Learning in Microsoft Azure

Let me begin with a video showing how Machine Learning helps to improve our lives.

https://www.youtube.com/watch?t=42&v=SGXTWLQ76jI

The lift in the video is from ThyssenKrupp Elevator, and it is an example of Predictive Maintenance. For more information about it, please read an article about how the system works and the challenges of implementing it on different types of lifts.

I first learnt about the term “Machine Learning” when I was taking the online Stanford AI course in 2011. The course taught us the basics of Artificial Intelligence, so I got the opportunity to learn about Game Theory, object recognition, robotic cars, path planning, machine learning, etc.

We learnt stuff like Machine Learning, Path Planning, and AI in the online Stanford AI course.

Meetup in Microsoft

I was very excited to see the announcement from Azure Community Singapore saying that a Big Data expert would talk about Azure Machine Learning at the community's monthly meetup.

Doli was telling us a story about Azure Machine Learning. (Photo Credit: Azure Community Singapore)

The speaker was Doli, a Big Data engineer working at iProperty Group in Malaysia. He gave us a good introduction to Azure Machine Learning, followed by Market Basket Analysis, Regression, and how a recommendation system works on Azure Machine Learning.

I found the talk interesting, especially for those who want to know more about Big Data and Machine Learning but are still new to them. I will try my best to share with you here what I learned from Doli’s 2-hour presentation.

Ano… What is Machine Learning?

Could we make the computer learn and behave more intelligently based on data? For example, is it possible to know from both the flight and weather data which scheduled flights are going to be delayed? Machine Learning makes it possible. Machine Learning takes historical data and makes predictions about future trends.

This Sounds Similar to Data Mining

During the meetup, a question was raised: what is the difference between Data Mining and Machine Learning?

Data Mining is normally carried out by a person to discover patterns in a massive, complicated dataset. Machine Learning, however, can make predictions based on previous patterns and data without human guidance.

There is a very insightful discussion on Cross Validated that I recommend for those who want to understand more about Data Mining and Machine Learning.

Supervised vs. Unsupervised Learning

Two types of Machine Learning tasks were highlighted in Doli’s talk: supervised and unsupervised learning.

Machine Learning – Supervised vs Unsupervised Learning

In supervised learning, new data is classified based on training data that comes with labels, which help the system learn by example. The web app how-old.net, which went viral recently, uses supervised learning. There is an interesting discussion on Quora about how how-old.net works. In the discussion, the Microsoft Bing Senior Program Manager, Eason Wang, also shared his blog post about the how-old.net project that he works on.

Gmail also uses supervised learning to find out which emails are spam or need to be prioritized. The slides of Introduction to Apache Mahout use YouTube recommendations as an example of supervised learning. This is because the recommendations given by YouTube take into account the videos that the user has explicitly liked, added to favourites, or rated.

I love watching anime so YouTube recommended me some great anime videos. =P
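
To make "learning by example" concrete, here is a minimal sketch of a supervised classifier: a one-nearest-neighbour rule in plain Python. It is only an illustration, not how how-old.net, Gmail, or YouTube actually work. The features (number of links and number of exclamation marks in an email) and the labelled examples are made up.

```python
# Hypothetical labelled training data: ((links, exclamation marks), label).
TRAINING_DATA = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((1, 1), "not spam"),
    ((7, 5), "spam"),
    ((8, 6), "spam"),
    ((9, 4), "spam"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(features, training_data=TRAINING_DATA):
    """Return the label of the training example closest to `features`."""
    _, label = min(
        training_data,
        key=lambda example: squared_distance(example[0], features),
    )
    return label


print(classify((8, 5)))  # closest examples are labelled "spam"
print(classify((0, 1)))  # closest examples are labelled "not spam"
```

The labels are what make this supervised: without them, the classifier would have nothing to copy its answer from.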

Unlike supervised learning, unsupervised learning tries to find structure in unlabeled data. Clustering, one of the unsupervised learning techniques, groups data into small groups based on similarity, such that data in the same group are as similar as possible and data in different groups are as different as possible. One example of a clustering algorithm is k-means clustering.
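
Here is a minimal sketch of k-means in plain Python. It assumes hand-picked starting centroids (real implementations usually choose them randomly), and the points are made up. It alternates between two steps: assign every point to its nearest centroid, then move each centroid to the mean of its group.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Group the points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        distances = [squared_distance(point, c) for c in centroids]
        clusters[distances.index(min(distances))].append(point)
    return clusters


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def k_means(points, centroids, iterations=10):
    """Return (centroids, clusters) after at most `iterations` rounds."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its old centroid.
        updated = [
            mean_point(cluster) if cluster else centroid
            for centroid, cluster in zip(centroids, clusters)
        ]
        if updated == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = updated
    return centroids, assign(points, centroids)


# Made-up points forming two obvious groups.
POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(POINTS, [(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Note that no labels are involved: the algorithm only looks at how close the points are to each other.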

Clearly, Machine Learning predictions are not about perfect accuracy.

Azure Machine Learning: Experiment!

With Azure Machine Learning, we are now able to perform cloud-based predictive analysis.

Azure Machine Learning is a service that developers can use to build predictive analytics models with training datasets. Those models can then be deployed as web services for consumption from C#, Python, and R. The process can be summarized as follows.

  1. Data Collection: Understanding the problem and collecting data
  2. Train: Training the model
  3. Analyze: Validating and tuning the model
  4. Deploy: Exposing the model to be consumed

Data Collection

Collecting data is part of the Experiment stage in Machine Learning. In case some of you wonder where to get large datasets, Doli shared with us a link to a discussion on Quora about where to find publicly accessible large datasets.

In fact, there are quite a number of sample datasets available in Azure Machine Learning Studio too. During the presentation, Doli also showed us how to use Reader to connect to a MS SQL server to get data.

Get data either from sample dataset or from reader (database, Azure Blob Storage, data feed reader, etc.).

To see the data of the dataset, we can click on the output port at the bottom of the box and then select “Visualize”.

Visualize the dataset.

After getting the data, we need to do pre-processing, i.e. cleaning up the data. For example, we need to remove rows which have missing data.

In addition, we will choose relevant columns from the dataset (aka features in machine learning) which will help in the prediction. Choosing columns requires a few rounds of experiments before finding a good set of features to use for a predictive model.

Let’s clean up the data and select only what we need.
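
Outside the Studio, the same two pre-processing steps can be sketched in plain Python. This is only an illustration: the flight records and column names below are hypothetical, and `None` stands for a missing value.

```python
# Hypothetical flight records; None marks missing data.
ROWS = [
    {"carrier": "AA", "dep_delay": 5, "temperature": 30.5, "tail": "N1", "delayed": 0},
    {"carrier": "BB", "dep_delay": None, "temperature": 28.0, "tail": "N2", "delayed": 1},
    {"carrier": "CC", "dep_delay": 42, "temperature": 31.2, "tail": "N3", "delayed": 1},
    {"carrier": "DD", "dep_delay": 0, "temperature": None, "tail": "N4", "delayed": 0},
]


def drop_incomplete_rows(rows):
    """Remove every row that has at least one missing value."""
    return [row for row in rows if all(value is not None for value in row.values())]


def select_columns(rows, columns):
    """Keep only the chosen columns (the features plus the prediction target)."""
    return [{column: row[column] for column in columns} for row in rows]


cleaned = select_columns(
    drop_incomplete_rows(ROWS),
    ["dep_delay", "temperature", "delayed"],
)
print(cleaned)
```

Trying a different column list and re-running the experiment is the "few rounds of experiments" part of finding a good set of features.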

Train and Analyze

As mentioned earlier, Machine Learning learns from a dataset and applies what it has learned to new data. Hence, in order to evaluate a Machine Learning algorithm, the collected data is split into two sets: the Training Set, used to train the model, and the Testing Set, used to check its predictions.

Doli said that the more data we use to train the model, the better. However, there are many people with different opinions. For example, there is an online discussion about the optimal ratio between the Training Set and the Testing Set. Some said 3:2, some said 1:1, and some said 3:1. I don’t know much about Statistical Analysis, so I will just make it 1:1, as shown in the tutorial in Machine Learning Studio.

Randomly split the dataset into two halves: a training set and a testing set.
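
As a rough sketch of what a random split does, assuming the dataset is simply a list of rows: shuffle a copy of the rows and cut it at the chosen fraction. The function name and the fixed seed are my own choices for the example; the seed only makes the split repeatable.

```python
import random


def split_dataset(rows, train_fraction=0.5, seed=42):
    """Randomly split `rows` into (training_set, testing_set)."""
    shuffled = list(rows)  # copy, so the original order is untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))

# 1:1, as in the Machine Learning Studio tutorial.
training_set, testing_set = split_dataset(rows, train_fraction=0.5)
print(len(training_set), len(testing_set))

# 3:1, one of the other ratios people suggested.
training_set_3_1, testing_set_3_1 = split_dataset(range(8), train_fraction=0.75)
print(len(training_set_3_1), len(testing_set_3_1))
```

Every row ends up in exactly one of the two sets, so the model is never tested on rows it was trained on.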

So, what “algorithm” are we talking about here? In Machine Learning Studio, there are many learning algorithms to choose from. I won’t go into details about which algo to choose here. =)

Choose learning algorithm and specify the prediction target.

Finally, we just hit the “Run” button located at the command bar to train the model and make a prediction on the test dataset.

After the run is successfully completed, we can view the prediction results.

Visualize results.
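
To show what "train the model and make a prediction on the test dataset" means in code, here is a minimal sketch using simple linear regression with one feature. It is not what Machine Learning Studio runs internally. The numbers are made up, and mean absolute error is just one possible way to score the predictions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new value."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Average distance between predictions and the known answers."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data, already split into a training set and a testing set.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
test_x, test_y = [5.0, 6.0], [10.1, 11.8]

model = fit_line(train_x, train_y)  # train on the training set only
error = mean_absolute_error(model, test_x, test_y)  # score on unseen data
print("slope=%.2f intercept=%.2f test error=%.2f" % (model[0], model[1], error))
```

A smaller error on the testing set suggests the model generalizes better to data it has not seen.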

Deploy

From here, we can improve the model by changing the features, the properties of the algorithm, or even the algorithm itself.

When we are satisfied with the model, we can publish it as a web service so that we can directly use it for new data in the future. Alternatively, we can also download an Excel workbook from Machine Learning Studio which has a macro added to compute the predicted values.

Read More and Join Our Meetup!

If you would like to find out more about Azure Machine Learning, there is a detailed step-by-step guide in the Microsoft Azure documentation about how to create an experiment in Machine Learning Studio. There is also a free e-book from Microsoft about Azure Machine Learning. Please take a look!

Oh ya, in case you would like to know more about how-old.net, which uses Machine Learning, please visit the homepage of Microsoft Project Oxford to find out more about the Face APIs, Speech APIs, Computer Vision APIs, and other cool APIs that you can use.

Please correct me if you spot any mistake in my post because I am still very, very new to Machine Learning. Please join our meetup too, if you would like to know more about Azure.

Entertainment Hub Version 1

I received my first Raspberry Pi back in October, ten days after I ordered it online. After that, I brought it back to my home in Kluang, Malaysia, because I would like to set up a home theatre with the help of the Raspberry Pi. Hopefully, in the near future, I can have a complete entertainment hub set up for my family. Thus, I named this project the Entertainment Hub.

Gunung Lambak, the highest point in Kluang.

Getting Raspberry Pi

Raspberry Pi is a credit-card-sized computer. According to the official website, it is designed to help students learn programming at a lower cost. To understand more about Raspberry Pi, you can read its detailed FAQ. By saving S$1 per day, I easily got myself a new Raspberry Pi Model B (with an 8GB SD card) after 2 months.

Entertainment Hub Project

Before using the Raspberry Pi, I was using a Wireless 1080p Computer to HD Display Kit from IOGEAR to stream video from my laptop to the home TV. It requires a one-time installation of both the software and drivers on the laptop before I can use its wireless USB transmitter to connect the PC to the wireless receiver, which is connected to the TV via HDMI. After the installation, whenever I wanted to show the videos stored on the external hard disk on the big screen, I always had to first switch on the receiver at the TV side and then plug the wireless USB transmitter into the laptop. Now, with the Raspberry Pi, I can easily browse the videos directly on the TV.

I only worked on the Entertainment Hub when I was at home. Also, because I only went back home on Saturday and needed to go back to Singapore on the following day, I didn’t really get much time to work on the project. Hence, I finally got video to show on the TV only after four trips back home.

Connecting External Hard Disk to Raspberry Pi

Before I started this project, I thought connecting an external hard disk directly to the Raspberry Pi would be enough. However, it turned out that this was not the case. When I connected the external hard disk to the Raspberry Pi directly, the USB optical mouse, which was connected to another USB port of the Raspberry Pi, lost its power. After doing some searches online, I found that it was most probably because the Raspberry Pi didn’t have enough power to power up both the hard disk and the optical mouse at the same time.

The USB hard disk I have is a 2.5” Portable HDD (Model: IM100-0500K) from Imation. After finding out that the Raspberry Pi had insufficient power for the portable hard disk, I chose to get a powered USB hub. Fortunately, there are nice people who have done a lot of tests on many, many USB hubs to find out which powered USB hubs work best with Raspberry Pi. They posted a useful list of working powered USB hubs online for us to use as a guideline when choosing a USB hub for Raspberry Pi. I bought the Hi-Speed USB 2.0 7-Port Hub by Belkin at Funan. Even though the model isn’t the same as the one in the list, the USB hub works fine in my case.

To find out if Raspberry Pi can detect the portable hard disk or not, simply use the following command.

sudo blkid

If the external hard disk can be detected, a result similar to the following should be printed.

/dev/sda1: LABEL="HDD Name" UUID="xxxxxxxxxxxxxxxxxx" TYPE="ntfs"

Luckily my Raspberry Pi can auto detect the external hard disk and then can mount it automatically.
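
If you want to check the blkid result from a script instead of reading it by eye, a line of that output can be parsed with a few lines of Python. This is a minimal sketch that assumes each line has the form `device: KEY="value" ...`; the function name is my own, and the sample line is the placeholder one shown above rather than real output.

```python
import re


def parse_blkid_line(line):
    """Split one line of blkid output into (device, {KEY: value})."""
    device, _, rest = line.partition(":")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return device.strip(), fields


sample = '/dev/sda1: LABEL="HDD Name" UUID="xxxxxxxxxxxxxxxxxx" TYPE="ntfs"'
device, fields = parse_blkid_line(sample)
print(device, fields["LABEL"], fields["TYPE"])
```

To feed it real output, the lines could come from running `sudo blkid` through Python's `subprocess` module; that part is left out here so that the sketch runs on any machine.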

Entertainment Hub Version 1 Structure

Enjoy Movies on Raspberry Pi

After successfully mounting the external hard disk on the Raspberry Pi, I just need to browse to the folders on the hard disk, pick the video files, and then play them using OMXPlayer, a video player pre-installed on the Raspberry Pi. As I used an HDMI cable to connect the Raspberry Pi to the TV, the following command successfully sends both audio and video to the TV.

omxplayer -o hdmi -r video.flv

The reason for having -r here is to adjust the video framerate and resolution. Without it, not only will the video not be displayed in full screen on the TV, but there also won’t be any audio from the TV.

When I first used omxplayer, it showed a black screen after I closed the program. There is documentation online about this issue, along with a solution. For me, the issue disappeared after I rebooted the Raspberry Pi.

Watching a movie with the help of Raspberry Pi.

Dad’s Help in the Project

The case of my Raspberry Pi was designed and made by my Dad. I am very happy and thankful that my Dad helped make a case for my Raspberry Pi. Usually, a Raspberry Pi case is box-shaped. However, the case I have here is a cylinder, so my Raspberry Pi looks special. =)

A closer look at my Raspberry Pi.

Future Work

With this little success of having movies played on the Raspberry Pi, the first part of the Entertainment Hub is done. Now, more needs to be done to make it more user-friendly and robust. First, it needs playlist support. Second, it needs the ability to replay videos. Third, it needs a better GUI for selecting videos, instead of just a command-line UI. All of these depend on how fast I learn to program an app on the Raspberry Pi.

Let’s look forward to the completion of this project.
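
As a first idea for the playlist and replay parts, here is a rough sketch in Python, which is not part of the project yet. It assumes the videos sit in a single folder and reuses the same omxplayer options as the command earlier in this post. The function names, the list of file extensions, and the folder path in the usage note are all hypothetical.

```python
import os
import subprocess

# Assumed set of video file extensions; adjust to whatever is on the disk.
VIDEO_EXTENSIONS = (".flv", ".mp4", ".mkv", ".avi")


def find_videos(folder):
    """Return the video files in `folder`, sorted by file name."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(VIDEO_EXTENSIONS)
    )


def build_command(path):
    """Build the same omxplayer invocation used earlier in this post."""
    return ["omxplayer", "-o", "hdmi", "-r", path]


def play_all(folder, repeat=1):
    """Play every video in `folder` in order, `repeat` times over."""
    for _ in range(repeat):
        for path in find_videos(folder):
            # Blocks until omxplayer exits, then moves on to the next video.
            subprocess.call(build_command(path))
```

Calling `play_all("/media/hdd/movies", repeat=2)` would then play the whole folder twice, where the path is only a placeholder for wherever the hard disk is mounted.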