— Written by Triangles on March 12, 2016 • ID 32 —
What machine learning is about, types of learning and classification algorithms, introductory examples.
Linear regression with one variable — Finding the best-fitting straight line through points of a data set.
The gradient descent function — How to find the minimum of a function using an iterative algorithm.
The gradient descent in action — It's time to put together the gradient descent with the cost function, in order to churn out the final algorithm for linear regression.
Multivariate linear regression — How to upgrade a linear regression algorithm from one to many input variables.
How to optimize the gradient descent algorithm — A collection of practical tips and tricks to improve the gradient descent process and make it easier to understand.
Introduction to classification and logistic regression — Get your feet wet with another fundamental machine learning algorithm for binary classification.
The cost function in logistic regression — Preparing the logistic regression algorithm for the actual implementation.
The problem of overfitting in machine learning algorithms — Overfitting makes linear regression and logistic regression perform poorly. A technique called "regularization" aims to fix the problem for good.
This is the first article in my series of notes on machine learning, a sub-field of Artificial Intelligence that has fascinated me for some time. The main source of knowledge will be the Machine Learning course on Coursera, taught by Andrew Ng of Stanford University, along with other books and online tutorials.
Arthur Lee Samuel had a nice definition of machine learning: the field of study that gives computers the ability to learn without being explicitly programmed. I also like what Drew Conway says about machine learning in his book Machine Learning for Hackers: it's just statistics done by computers.
In general, machine learning is used to make predictions on data. But instead of hard-coding those predictions with a custom algorithm, you let the program itself figure out the best output, based on the input data. This is also called data-driven prediction or decision making. Sounds like magic, doesn't it? Let me show a couple of examples and everything will become clearer.
Machine learning tasks are typically classified into several broad categories, depending on which side of your program you are looking at. If you think of the output side of machine learning, that is, the outcome of your program, the best-known learning tasks are regression and classification.
So you have a bunch of data and you want to make a prediction on it. For example, you are collecting real-estate information in your city because you want to predict the price of a house given, say, its size in square feet. You start gathering data and you end up with something like picture 1:
Each dot on the graph is one surveyed house. For example, you found out that a 1000 square feet house is worth about $200,000 (are those fantasy numbers? I don't know, sorry). You also found out that a ~1300 square feet house is worth ~$250,000. Machine learning will help you answer questions like: how much is a 1100 square feet house worth, given my input data?
In this case the output of your machine learning algorithm takes continuous values, i.e. any number from $0 to $400,000. This is a regression problem. A handy way to remember the weird name is that you "regress" your data to a line (the dotted one in picture 1), with a corresponding mathematical equation. If you know the equation, you can find the output (y) for any input (x). The operation is called linear regression and I will deal with it in future chapters.
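To make the idea of "finding the equation of the line" a bit more concrete, here is a minimal sketch in Python. It is only an illustration under a few assumptions: the data set is made up (only the 1000 and 1300 square feet entries echo the numbers above), the names `fit_line` and `predict` are hypothetical, and the line is computed with the textbook closed-form least-squares formula rather than with the iterative method covered later in the series.

```python
# Hypothetical survey data: only the 1000 and 1300 sq ft entries echo the
# article's examples; the other points are invented for illustration.
sizes = [800, 1000, 1300, 1500, 1800]                   # square feet
prices = [170_000, 200_000, 250_000, 280_000, 330_000]  # dollars


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (x vs. y) and sum of squared deviations (x)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(size, slope, intercept):
    """Evaluate the line y = slope * x + intercept for a given house size."""
    return slope * size + intercept


slope, intercept = fit_line(sizes, prices)
print(f"Estimated price of a 1100 sq ft house: ${predict(1100, slope, intercept):,.0f}")
```

With this invented data the estimate for the 1100 square feet house lands between the prices of the 1000 and 1300 square feet houses, which is what you would expect from reading the dotted line on the chart.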
Let's change topic now: you want to know whether a watermelon is more or less sweet given its size. As always, you start collecting data and you end up with a chart like the following one (picture 2):
Each dot is a survey, and the full dots are the sweeter watermelons. Fantasy numbers here, too. Then, given a new watermelon of, say, 41 centimeters in diameter, you want to know whether its flavor will be more or less sweet.
In this case the output of your machine learning algorithm takes discrete values: more sweet or less sweet. You are basically classifying things, like putting labels on each outcome, and that's called classification.
The vertical dotted line is the hyperplane, a decision boundary generated by the algorithm and used to tell the two classes apart. Your program decided that values below ~33 cm are classified as "more sweet" and values above it as "less sweet". More on that in future chapters, of course.
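Here is a rough sketch of how a program could come up with such a boundary on its own. Everything in it is an assumption made for illustration: the diameters and labels are invented, the function names `best_threshold` and `classify` are hypothetical, and the strategy (try every midpoint between neighbouring diameters and keep the one with the fewest mistakes) is a deliberately naive stand-in, not the logistic regression algorithm discussed later in the series.

```python
# Hypothetical data: diameter in cm and whether the watermelon was "more sweet".
diameters = [25, 28, 30, 32, 34, 36, 40, 45]
is_sweet = [True, True, True, True, False, False, False, False]


def best_threshold(diameters, is_sweet):
    """Pick the boundary that misclassifies the fewest training examples.

    Assumes, as in the chart, that smaller watermelons are the sweeter ones.
    """
    values = sorted(set(diameters))
    # Candidate boundaries: midpoints between neighbouring distinct diameters
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        # Count examples whose predicted label differs from the known label
        return sum((d < threshold) != sweet for d, sweet in zip(diameters, is_sweet))

    return min(candidates, key=errors)


def classify(diameter, threshold):
    """Label a new watermelon according to the learned boundary."""
    return "more sweet" if diameter < threshold else "less sweet"


threshold = best_threshold(diameters, is_sweet)
print(f"Learned boundary: {threshold} cm")
print(f"A 41 cm watermelon is classified as: {classify(41, threshold)}")
```

The output is one of two labels, never a number in between, which is exactly what separates classification from regression.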
I've talked about the output side of a machine learning program so far. When you think of the input side instead, that is the data you feed into it, two broad learning categories come up: supervised and unsupervised learning.
In supervised learning you give the algorithm the right answers in advance. For example, let's take a look back at the house pricing dataset in picture 1. There, for every point (size in square feet) I told the program the right price. The algorithm just had to produce more of those right answers.
The watermelon example in picture 2 was a supervised learning task as well. I told the program the sweetness of each watermelon in advance, and it just had to predict the output for new watermelon sizes given as input.
Unsupervised learning brings yet more black magic onto the scene: you let the algorithm figure out the labels itself. This approach brings in the concept of clustering: the task of grouping objects so that those in the same group (called a cluster) are more similar to each other than to those in other groups.
Unsupervised learning is great when you don't know how to label things in advance. For example, let's think of an image classification problem. You have a bunch of pictures you want to classify based on what they portray. You don't provide the algorithm with the right labels, maybe because even you don't know what the pictures are about. The task of the machine learning program is to find similarities in the input data and figure out by itself the best way to sort the pictures into proper groups, or clusters.
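As a small taste of clustering, the sketch below groups unlabeled points with k-means, one common clustering algorithm (my choice for the example, since no specific algorithm has been named so far). The assumptions are many: each "picture" is pretended to be already reduced to two made-up numeric features, the function names are hypothetical, and the starting centroids are simply the first k points, which is a naive initialisation that works here only because the toy data is well separated.

```python
# Hypothetical data: each picture reduced to two invented numeric features.
# Note that no labels are given anywhere.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Group points into k clusters; return (labels, centroids)."""
    # Naive initialisation (an assumption of this sketch): first k points
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centroids


labels, centroids = kmeans(points, k=2)
for point, label in zip(points, labels):
    print(f"{point} -> cluster {label}")
```

The cluster numbers are arbitrary names invented by the algorithm: it can tell you which points belong together, but not what the groups actually mean.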