# How to evaluate your machine learning model like a pro

**Updated on:** March 22, 2020 · 13 min read

1. What is the process of evaluating your model?

2. Why do we need to evaluate our model?

3. Supervised learning and classification problems.

4. Accuracy

5. Recall

6. Precision

7. F1-score

8. Confusion matrix

9. Evaluating for regression problems

10. Mean Absolute Error

11. Mean Square Error

12. Root Mean Square Error

13. What is considered as a good metric value for your model?

You are happy with your machine learning model and excited to share it with your client. You walk the client through the new model, eager to run it on new production data. Checking your eagerness, the client asks one question.

"How accurate is your model?"

What should your answer to this question be? Today we are going to talk about how to measure the performance of your model, along with some code samples that will help you calculate the metrics easily. By the end of the post, you will know and understand the common ways of evaluating your machine learning model.

## What is the process of evaluating your model?

Machine learning is a way of predicting results for inputs that the model has not seen before. For example, predicting stock prices from the historical data of a particular stock can tell us whether it would be profitable to buy that stock on a particular day. Evaluating a model means checking how well it performs when test data, a piece of data which the model has never seen before, is passed to it.

## Why do we need to evaluate our model?

In general, an evaluation metric gives you a number to share when you are talking about your model with people in other teams who don't understand the tech. No model is 100% correct, and there is no perfect score you have to achieve. It depends on the case. Also, some wrong answers are worse than others. For example, suppose you are building a cancer detector that works from the test reports of a patient. It is better for the model to say that the patient might have `cancer`, and let a real cancer test rule it out, than to give a `no cancer` output when in reality the person has cancer.

## Supervised learning and classification problems.

`Supervised learning` problems are the problems where the outcomes are already known for the data the model learns from. For example, a data set of housing prices of an area.

`Classification problems` are a subset of supervised learning where the outcomes are divided into two or more classes. For example, whether a person has cancer or not.

All these models can be evaluated using the metrics below.

### Test train split using sklearn

Generally we divide the total dataset into two parts. The first part is known as the training data and the other is known as the test data. The idea behind this division is to use the test data only for evaluation, to find out whether the model we chose is good enough for the given dataset or whether we want to use a different one. Here is a code sample which can help you divide your data frame into training and test parts using scikit-learn.

```
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the dataset into a data frame
df = pd.read_csv('Ecommerce Customers')

# Features (inputs) and target (the value we want to predict)
X = df[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
y = df['Yearly Amount Spent']

# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
```

`test_size` is the fraction of the data that you want your test split to be. `0.3` means that the test split will be 30% of the data.
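To make the effect of `test_size` concrete, here is a hand-rolled sketch of a 70/30 split that uses only the Python standard library. The helper name `simple_split`, its `seed` parameter and the toy rows are assumptions made for illustration; this is not how scikit-learn implements `train_test_split`, only the same idea of shuffling and cutting.

```python
import random


def simple_split(rows, test_size=0.3, seed=42):
    """Shuffle a copy of `rows` and cut it into (train, test) parts."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(rows)          # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))             # stand-in for 10 rows of a data frame
train, test = simple_split(rows, test_size=0.3)
print(len(train), len(test))       # 7 3
```

With `train_test_split`, passing `random_state` plays the same role as the seed here: it makes the split repeatable between runs.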
## Accuracy

Accuracy is a simple calculation where you divide the number of data points predicted correctly by the total number of data points.

$accuracy = \frac{number\ of\ correct\ predictions}{total\ number\ of\ test\ cases}$

We first calculate the `predictions` for the given `X_test`, and then we compare these predictions with `y_test`, which holds the real outcomes for the same inputs.

We will talk more about how we calculate these values in later posts.

Accuracy is one of the easiest ways to evaluate the performance of your model.

**Note:** This type of evaluation is not the best thing to use when the data available to you is unbalanced.

Unbalanced data is a dataset in which you have many more examples of one outcome than of the others. For example, in a cat-dog image dataset of 100 pictures, you have 90 pictures of dogs and 10 pictures of cats. In such a case, if your model always returns `dog`, no matter what picture is thrown at it, the accuracy will still be `90%`.
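The cat-dog example can be checked with a few lines of plain Python. This is a minimal sketch: the `accuracy` helper is written here for illustration, and the "model" is simply a list that answers `dog` every time. `sklearn.metrics.accuracy_score` should report the same number for these lists.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# 100 pictures: 90 dogs and 10 cats
y_true = ["dog"] * 90 + ["cat"] * 10
# A "model" that answers dog for every picture
y_pred = ["dog"] * 100

acc = accuracy(y_true, y_pred)
print(acc)  # 0.9
```

The score looks good even though the model never recognises a single cat, which is why accuracy alone is misleading on unbalanced data.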
## Recall

Recall is the ability of your model to find all the relevant cases in a dataset. Metrics like this one are mostly used when the data is imbalanced.

$recall = \frac{number\ of\ true\ positives}{number\ of\ true\ positives + number\ of\ false\ negatives}$

Consider a model which predicts that every volcano will erupt the next day, tested on a dataset of 100 volcanoes in which 20% actually erupt the next day. The value of recall will look something like this.

$recall = \frac{volcanoes\ correctly\ identified}{volcanoes\ correctly\ identified + volcanoes\ incorrectly\ labelled\ to\ not\ erupt\ tomorrow}$

$recall = \frac{20}{20 + 0}$

$recall = 1$

Although this value of `recall` is great, recall goes hand in hand with `precision`, and in most cases both values are considered.
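The same arithmetic can be reproduced in code. The sketch below assumes 100 volcanoes of which 20 erupt, encodes "erupts" as `1` and "does not erupt" as `0`, and uses a hand-written `recall` helper; these names and encodings are illustrative choices. `sklearn.metrics.recall_score` should give the same result.

```python
def recall(y_true, y_pred, positive=1):
    """True positives divided by (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp / (tp + fn)


y_true = [1] * 20 + [0] * 80   # 20 volcanoes erupt, 80 do not
y_pred = [1] * 100             # the model says "erupts" for every volcano

rec = recall(y_true, y_pred)
print(rec)  # 1.0
```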
## Precision

Precision is the ability of a model to identify only the relevant data points.

$precision = \frac{number\ of\ true\ positives}{number\ of\ true\ positives + number\ of\ false\ positives}$

For the same volcano problem,

$precision = \frac{volcanoes\ correctly\ identified}{volcanoes\ correctly\ identified + volcanoes\ incorrectly\ labelled\ to\ erupt\ tomorrow}$

$precision = \frac{20}{20 + 80}$

$precision = 0.2$

In general, the goal is to maximize both recall and precision.
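Here is the matching sketch for precision on the volcano example, again assuming 100 volcanoes with "erupts" encoded as `1`. The `precision` helper is written for illustration; `sklearn.metrics.precision_score` should agree with it.

```python
def precision(y_true, y_pred, positive=1):
    """True positives divided by (true positives + false positives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fp)


y_true = [1] * 20 + [0] * 80   # 20 volcanoes erupt, 80 do not
y_pred = [1] * 100             # the model says "erupts" for every volcano

prec = precision(y_true, y_pred)
print(prec)  # 0.2
```

The model that scored a perfect recall gets a poor precision, because 80 of its 100 "erupts" predictions are wrong.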
## F1-score

It is the harmonic mean of precision and recall. F1-score helps us to consider both the values of precision and recall while evaluating our model.

$F1\text{-}score = \frac{2 * precision * recall}{precision + recall}$
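As a quick worked example, the harmonic mean of precision and recall is `2 * precision * recall / (precision + recall)`. Plugging in the volcano values (precision 0.2, recall 1) shows how the low precision drags the score down. The helper name `f1_from` is an illustrative choice, and returning 0 when both inputs are 0 is an assumption made to avoid a division by zero.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0             # assumption: define F1 as 0 when both are 0
    return 2 * precision * recall / (precision + recall)


f1 = f1_from(0.2, 1.0)
print(round(f1, 3))  # 0.333
```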

## Confusion matrix

A confusion matrix is a matrix in which we count all the positive and negative predictions, broken down into:

- True Positive
- False Positive
- True Negative
- False Negative

Consider a `confusion matrix` for a case where patients' data was tested for a specific disease. Out of a total of 165 patients, our model produced the following results.

**True Positives:** Our model predicted that 100 patients were carrying the disease, and they were actually carrying the disease.

**True Negatives:** Our model predicted that 50 patients were not carrying the disease, and they actually were not carrying it.

**False Positives:** Our model predicted that 10 patients were carrying the disease, and they actually were not carrying it. This is also known as a **Type-I error**.

**False Negatives:** Our model predicted that 5 patients were not carrying the disease, and they actually were carrying it. This is also known as a **Type-II error**.

We can go forward and calculate all the values for accuracy, recall, precision and F1-score from this confusion matrix. To print the confusion matrix of a model in `sklearn`, use the following code.
```
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, predictions))
# where y_test is the data frame of test values
# and predictions are the model predicted values
```

You can also print a report with `recall`, `precision`, etc. using the following code.
```
from sklearn.metrics import classification_report
print(classification_report(y_test, predictions))
```
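The four counts from the patient example are enough to work out each metric by hand. The sketch below plugs them into the formulas from the earlier sections; the variable names are illustrative.

```python
# Counts from the patient example (165 patients in total)
tp, tn, fp, fn = 100, 50, 10, 5

total = tp + tn + fp + fn                # 165
accuracy = (tp + tn) / total             # correct predictions / all predictions
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy  = {accuracy:.3f}")     # 0.909
print(f"precision = {precision:.3f}")    # 0.909
print(f"recall    = {recall:.3f}")       # 0.952
print(f"f1        = {f1:.3f}")           # 0.930
```

When reading the output of `confusion_matrix` for binary labels `0` and `1`, note that rows are the true labels and columns are the predicted labels, so the layout is `[[TN, FP], [FN, TP]]`.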

## Evaluating for regression problems

Regression problems are a little different from classification problems because the output is a continuous value rather than one of a few classes. Here you can actually see how far off your predicted value was from the real value. Here are three ways in which regression models can be evaluated.

## Mean Absolute Error

It is the average of the absolute differences between the predicted and the actual values. Here is the simplest way in which you can calculate the Mean Absolute Error for your model.

$MAE = \frac{1}{n}\sum\left | Y_a - Y_e \right |$

where $Y_a$ is the actual value, $Y_e$ is the model's predicted value and $n$ is the number of data points.

You can get the value in `sklearn` using the following code.
```
from sklearn import metrics
metrics.mean_absolute_error(y_test, predictions)
```
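To see what the formula does, here is a minimal sketch that computes the same quantity by hand on four made-up data points. The helper name `mae` and the numbers are illustrative only.

```python
def mae(actual, predicted):
    """Average of the absolute differences between actual and predicted."""
    return sum(abs(a - e) for a, e in zip(actual, predicted)) / len(actual)


y_actual = [10, 20, 30, 40]
y_predicted = [12, 18, 33, 40]   # off by 2, 2, 3 and 0

mae_value = mae(y_actual, y_predicted)
print(mae_value)  # 1.75
```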

## Mean Square Error

Mean square error is the mean of the squared differences between the predicted and actual values.

$MSE = \frac{1}{n}\sum (Y_a - Y_e)^{2}$

You can get the value in `sklearn` using the following code.
```
from sklearn import metrics
metrics.mean_squared_error(y_test, predictions)
```

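Because the errors are squared, the result is expressed in squared units of the target. A model whose predictions are off by 10 units is reported as being `100 square units` off, which is hard to interpret.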
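A small hand calculation shows the squared-unit effect. In this sketch every prediction is off by exactly 10 units, and the helper name `mse` and the numbers are made up for illustration.

```python
def mse(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - e) ** 2 for a, e in zip(actual, predicted)) / len(actual)


y_actual = [100, 200]
y_predicted = [110, 190]   # each prediction is off by 10 units

mse_value = mse(y_actual, y_predicted)
print(mse_value)  # 100.0
```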
## Root Mean Square Error

To solve this problem with `MSE`, we use `RMSE`, which is just the square root of `MSE`. Since the result is back in the units of the target, we can judge the quality of our model directly from this value.

$RMSE = \sqrt{\frac{1}{n}\sum (Y_a - Y_e)^{2}}$

You can calculate this in Python using NumPy and `sklearn`.
```
import numpy as np
from sklearn import metrics

# RMSE is the square root of the mean squared error
np.sqrt(metrics.mean_squared_error(y_test, predictions))
```
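For comparison, the same calculation can be sketched with only the standard library. The helper name `rmse` and the four data points are illustrative.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean of the squared differences."""
    squared = [(a - e) ** 2 for a, e in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


y_actual = [3.0, 5.0, 2.0, 7.0]
y_predicted = [2.0, 5.0, 4.0, 8.0]   # squared errors: 1, 0, 4, 1

rmse_value = rmse(y_actual, y_predicted)
print(round(rmse_value, 3))  # 1.225
```

The mean squared error here is 1.5, and taking its square root brings the value back to the same units as the target.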

## What is considered as a good metric value for your model?

As we have already discussed, a good metric value really depends on the type of model you are evaluating. You always have to check with the people who are actually going to use your model. Hope you liked the post; do leave your thoughts on what you use to evaluate your model and what you think is a good metric value for it.