In this article, one of our software engineering experts from the HAT introduces Machine Learning and classification algorithms.
Classification algorithms predict the class or category for a single instance of data. For example, email filters use binary classification to determine whether an email is spam. There are two forms of classification tasks. The first is binary classification, where the goal is to predict one of two outcomes.
The other is multiclass classification, where the goal is to predict one of many outcomes. The output of a classification algorithm is called a classifier, which can be used to predict the label of a new (unlabeled) instance.
Classification is a form of supervised learning: supervised learning algorithms make predictions based on a set of examples. For instance, historical stock prices can be used to hazard guesses at future prices. Each example used for training is labeled with the value of interest.
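To make the idea of training on labeled examples concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid rule, which is our own choice for illustration and not an algorithm named in this article; the two features and the example emails are hypothetical.

```python
from collections import defaultdict

def train_centroids(examples):
    """Average the feature vectors of each label: {label: centroid}."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the new instance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical labeled emails: [number of links, mentions of "free"]
training = [
    ([8.0, 5.0], "spam"),
    ([7.0, 6.0], "spam"),
    ([1.0, 0.0], "not spam"),
    ([0.0, 1.0], "not spam"),
]

classifier = train_centroids(training)      # the trained "classifier"
new_label = predict(classifier, [6.0, 4.0])  # label for a new, unlabeled email
print(new_label)
```

The dictionary returned by train_centroids plays the role of the classifier: it is built once from labeled data and then reused for every new instance.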
Let’s start with the question: Is this A or B?
This family of algorithms is called two-class classification.
It’s useful for any question that has just two possible answers. There are several algorithms for this kind of question. The next image represents a two-class support vector machine, one of the most popular.
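As a rough illustration of how a two-class support vector machine separates A from B, the sketch below trains a linear SVM with subgradient descent on the hinge loss. This is a simplified teaching version under our own assumptions (learning rate, regularization strength, toy points), not the implementation behind any AzureML module.

```python
def train_linear_svm(points, labels, epochs=100, lr=0.01, reg=0.01):
    """Fit weights w and bias b by subgradient descent on the hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term shrinks w a little on every step.
            w = [wi * (1 - lr * reg) for wi in w]
            if margin < 1:
                # Inside the margin or misclassified: move the boundary.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict_svm(w, b, x):
    """Return +1 or -1 depending on the side of the boundary."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable points: -1 stands for "A", +1 for "B".
points = [[-2.0, -1.0], [-1.0, -2.0], [-2.0, -2.0],
          [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
print([predict_svm(w, b, x) for x in points])
```

The only thing the model learns is a line (w, b); the answer to "A or B?" is simply which side of that line a new point falls on.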
Is this A or B or C or D, etc.?
This is called multiclass classification and it’s useful when you have several—or several thousand—possible answers. Multiclass classification chooses the most likely one.
The next image represents a one vs. all classification.
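The one vs. all idea can be sketched in a few lines: train one two-class model per class ("this class" against "everything else"), then pick the class whose model gives the highest score. The binary scorer used here (distance to the centroid of the rest minus distance to the centroid of the class) is a deliberately simple stand-in that we assume for illustration, and the points are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_one_vs_all(examples):
    """Build one 'class vs. rest' model per label."""
    labels = sorted({label for _, label in examples})
    models = {}
    for target in labels:
        positives = [x for x, label in examples if label == target]
        negatives = [x for x, label in examples if label != target]
        models[target] = (centroid(positives), centroid(negatives))
    return models

def score(model, x):
    """Higher means 'more like this class than like the rest'."""
    class_center, rest_center = model
    return sq_dist(x, rest_center) - sq_dist(x, class_center)

def predict_one_vs_all(models, x):
    """Choose the class whose binary model is most confident."""
    return max(models, key=lambda label: score(models[label], x))

# Hypothetical training points for three classes.
examples = [
    ([0.0, 0.0], "A"), ([1.0, 0.0], "A"),
    ([10.0, 0.0], "B"), ([10.0, 1.0], "B"),
    ([5.0, 10.0], "C"), ([6.0, 10.0], "C"),
]

models = train_one_vs_all(examples)
print(predict_one_vs_all(models, [9.0, 1.0]))
```

Any two-class algorithm that produces a score could replace the scorer above; the one vs. all wrapper stays the same.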
AzureML algorithms for Classification
The category Initialize Classification Model includes the following modules:
For the complete documentation of each module, go here!
1- Selection of data set
We use the Adult Census Income Binary Classification data set.
The column income is the label.
2- Using Select Columns and Split Data
We select the columns that we expect to be most useful for the prediction, and then we split the data into one part for training the model and another part for scoring it.
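Outside the designer, the same two operations can be sketched in plain Python. The rows below are hypothetical, the column names are assumed to match the census data, and the 75/25 split and the seed are our own choices, not values taken from the experiment.

```python
import random

def select_columns(rows, columns):
    """Keep only the named columns in each row (a row is a dict)."""
    return [{name: row[name] for name in columns} for row in rows]

def split_rows(rows, train_fraction=0.75, seed=42):
    """Shuffle a copy of the rows, then cut it into two parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical rows; "fnlwgt" stands for a column we decide not to use.
rows = [
    {"age": 39, "education": "Bachelors", "hours-per-week": 40, "fnlwgt": 1, "income": "<=50K"},
    {"age": 50, "education": "Masters", "hours-per-week": 45, "fnlwgt": 2, "income": ">50K"},
    {"age": 28, "education": "HS-grad", "hours-per-week": 40, "fnlwgt": 3, "income": "<=50K"},
    {"age": 53, "education": "Doctorate", "hours-per-week": 50, "fnlwgt": 4, "income": ">50K"},
    {"age": 31, "education": "Bachelors", "hours-per-week": 38, "fnlwgt": 5, "income": "<=50K"},
    {"age": 44, "education": "Masters", "hours-per-week": 60, "fnlwgt": 6, "income": ">50K"},
    {"age": 23, "education": "HS-grad", "hours-per-week": 30, "fnlwgt": 7, "income": "<=50K"},
    {"age": 37, "education": "Bachelors", "hours-per-week": 42, "fnlwgt": 8, "income": "<=50K"},
]

selected = select_columns(rows, ["age", "education", "hours-per-week", "income"])
train_rows, scoring_rows = split_rows(selected)
print(len(train_rows), len(scoring_rows))
```

Fixing the seed keeps the split reproducible, so the training and scoring parts stay the same from one run to the next.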
3- Using the Two-Class Boosted Decision Tree
We drop the classification model into the canvas and leave the default parameter values.
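To give a feel for what "boosted" means, the sketch below combines several very small trees (one-split "stumps"), each one trained to focus on the examples the previous ones got wrong. Note that this uses AdaBoost, a simpler boosting scheme that we swap in for illustration; it is not the algorithm inside the AzureML module, and the data and the number of rounds are hypothetical.

```python
import math

def stump_predict(point, feature, threshold, sign):
    """A one-split tree: answer `sign` above the threshold, `-sign` otherwise."""
    return sign if point[feature] > threshold else -sign

def best_stump(points, labels, weights):
    """Find the stump with the lowest weighted error."""
    best = None
    for feature in range(len(points[0])):
        for threshold in sorted({p[feature] for p in points}):
            for sign in (1, -1):
                error = sum(
                    w for w, p, y in zip(weights, points, labels)
                    if stump_predict(p, feature, threshold, sign) != y
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, sign)
    return best

def train_boosted_stumps(points, labels, rounds=3):
    """AdaBoost: reweight the examples after each stump is added."""
    n = len(points)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, feature, threshold, sign = best_stump(points, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, feature, threshold, sign))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [
            w * math.exp(-alpha * y * stump_predict(p, feature, threshold, sign))
            for w, p, y in zip(weights, points, labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict_boosted(ensemble, point):
    """Weighted vote of all stumps."""
    vote = sum(alpha * stump_predict(point, feature, threshold, sign)
               for alpha, feature, threshold, sign in ensemble)
    return 1 if vote > 0 else -1

# Hypothetical one-feature data that no single stump can separate.
points = [[1], [2], [3], [4], [5], [6]]
labels = [1, 1, -1, -1, 1, 1]

ensemble = train_boosted_stumps(points, labels)
print([predict_boosted(ensemble, p) for p in points])
```

On this toy data a single stump always misclassifies at least two points, while the weighted vote of three stumps gets every training point right, which is the core benefit of boosting.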
4- Score and evaluate model
Run the experiment to check the score and evaluate model results.
The two rightmost columns, Scored Labels and Scored Probabilities, are the prediction results. The Scored Probabilities column shows the probability that the instance belongs to the positive class (in this case “> 50K”).
For more documentation on interpreting model results, see here.
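The relationship between the two result columns, and the kind of metrics an evaluation step reports, can be sketched as follows. The probabilities and true labels are hypothetical, and the 0.5 cut-off is an assumption on our part.

```python
def scored_labels_from(probabilities, threshold=0.5,
                       positive=">50K", negative="<=50K"):
    """Turn each scored probability into a scored label."""
    return [positive if p >= threshold else negative for p in probabilities]

def evaluate(actual, scored, positive=">50K"):
    """Compute accuracy, precision and recall for the positive class."""
    pairs = list(zip(actual, scored))
    tp = sum(1 for a, s in pairs if a == positive and s == positive)
    fp = sum(1 for a, s in pairs if a != positive and s == positive)
    fn = sum(1 for a, s in pairs if a == positive and s != positive)
    tn = sum(1 for a, s in pairs if a != positive and s != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical true labels and scored probabilities for six rows.
actual = [">50K", "<=50K", ">50K", "<=50K", "<=50K", ">50K"]
scored_probabilities = [0.91, 0.12, 0.45, 0.63, 0.08, 0.77]

scored_labels = scored_labels_from(scored_probabilities)
metrics = evaluate(actual, scored_labels)
print(scored_labels)
print(metrics)
```

Raising or lowering the threshold trades precision against recall, which is why it helps to look at the probabilities and not only at the final labels.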
5- Finished experiment
The image below represents the entire experiment ready to run.