TASK-5

Mangal Hansdah
6 min read · Jun 6, 2021

SUMMER-TRAINING-2021

Task description:-

Confusion matrix: its two types of error, and how it relates to cybercrime cases

Don’t know what cybercrime and a confusion matrix are? Not sure what to take away from one? Is the confusion matrix making you confused?

Then this article is cooked and served just for you.

What is Cybercrime?

Cybercrime, also called computer crime, is a crime that involves a computer and a network. The computer may have been used in the commission of a crime, or it may be the target. Cybercrime may harm someone’s security and financial health.

What is Cybersecurity?

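Cybersecurity is the practice of protecting computers, networks, programs, and data from unauthorized access, attack, or damage. It is the defensive counterpart to cybercrime, which uses a computer as an instrument to further illegal ends, such as committing fraud, trafficking intellectual property, stealing identities, or violating privacy.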

What is a Cyberattack?

A cyberattack is an assault launched by cybercriminals using one or more computers against a single computer, multiple computers, or networks. A cyberattack can maliciously disable computers, steal data, or use a breached computer as a launch point for other attacks.

How is Machine Learning helping to reduce cyberattack risks?

As cyberattacks grow in volume and complexity, machine learning is helping under-resourced security teams stay ahead of threats and vulnerabilities.

Machine-learning-based cybersecurity systems are trained to detect viruses and malware by recognizing patterns in software and its behavior. They can learn to identify even small traces of ransomware and malware activity before it enters your system. And they can use predictive functions that help you get ahead of the curve, often working faster than traditional, manually maintained cybersecurity approaches.

Various Machine Learning Algorithms Used in Cybersecurity to Prevent Attacks

  1. Isolation Forest
  2. Histogram-based outlier detection
  3. Cluster-based local outlier detection
  4. Angle-based outlier detection

When we train a machine learning model there is always a chance of error, because no model predicts everything accurately. Accuracy often lands somewhere between 80 and 90%, so some predictions will be wrong.
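To make the accuracy figure concrete, here is a minimal sketch of how accuracy is computed: the share of predictions that match the actual labels. The labels below are made up for illustration (1 = malicious, 0 = benign) and the function name `accuracy` is my own choice.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one prediction")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels, invented for this example: 1 = malicious, 0 = benign
actual = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

score = accuracy(actual, predicted)
print(f"accuracy = {score:.0%}")
```

Accuracy alone does not say which kind of mistake the model made, and that is the gap the confusion matrix fills.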

Confusion matrix:-

A confusion matrix is a summary of prediction results on a classification problem. The number of correct and incorrect predictions is summarized with count values and broken down by each class. This is the key to the confusion matrix: it shows the ways in which your classification model is confused when it makes predictions.

What is confusion matrix?

In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in an actual class while each column represents the instances in a predicted class, or vice versa — both variants are found in the literature. The name stems from the fact that it makes it easy to see whether the system is confusing two classes (i.e. commonly mislabeling one as another).

It is a special kind of contingency table, with two dimensions (“actual” and “predicted”), and identical sets of “classes” in both dimensions (each combination of dimension and class is a variable in the contingency table).
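The table layout described above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: rows are actual classes and columns are predicted classes (the other convention is just the transpose), and the class names and labels are invented for the example.

```python
def confusion_matrix(actual, predicted, classes):
    """Build a square count table: rows = actual class, columns = predicted class."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical classes and labels, invented for this example
classes = ["benign", "malware", "phishing"]
actual = ["benign", "malware", "phishing", "malware", "benign", "phishing"]
predicted = ["benign", "malware", "malware", "malware", "phishing", "phishing"]

matrix = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, matrix):
    print(f"{label:>8}: {row}")
```

The diagonal holds the correct predictions. Any count off the diagonal shows which class was confused with which: here one phishing sample was labeled as malware and one benign sample was labeled as phishing.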

Structure of the Confusion Matrix:-

The size of the matrix depends on the number of output classes: it is a square matrix with one row and one column per class, so a problem with N classes gives an N × N matrix. Here we take the column headers as actual values and the row headers as model predictions. Actual positives that the model predicts as positive are True Positives (TP), actual negatives predicted as negative are True Negatives (TN), actual negatives predicted as positive are False Positives (FP), and actual positives predicted as negative are False Negatives (FN).

Let’s have a look at the four possible outcomes :-

  • True Positive: You predicted positive and it’s true. You predicted that a woman is pregnant and she actually is.
  • True Negative: You predicted negative and it’s true. You predicted that an old man is not pregnant and he actually is not (it’s an old man).
  • False Positive (Type 1 Error): You predicted positive and it’s false. You predicted that an old man is pregnant but he actually is not (it’s an old man).
  • False Negative (Type 2 Error): You predicted negative and it’s false. You predicted that a woman is not pregnant but she actually is.
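The four outcomes can be counted directly from a list of actual values and a list of predictions. The sketch below assumes boolean labels where `True` means positive; the function name `binary_counts` and the sample data are made up for illustration.

```python
def binary_counts(actual, predicted):
    """Count TP, TN, FP and FN for boolean labels (True = positive)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1  # predicted positive, actually positive
        elif not a and not p:
            tn += 1  # predicted negative, actually negative
        elif not a and p:
            fp += 1  # predicted positive, actually negative (Type 1 error)
        else:
            fn += 1  # predicted negative, actually positive (Type 2 error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical data, invented for this example
actual = [True, True, False, False, True, False, False, True]
predicted = [True, False, False, True, True, False, False, False]

counts = binary_counts(actual, predicted)
print(counts)
```

The four counts always add up to the number of predictions, which is a quick sanity check on any confusion matrix.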

TWO TYPES OF ERROR OF CONFUSION MATRIX :

Confusion matrices have two types of errors: Type I Error and Type II Error.

  • False Positive (Type I Error):

How often the model predicts ‘yes’ for an actual ‘no’.

There are two ways to remember which error is which. The first way is to rewrite False Negative and False Positive. False Positive is a Type I error because False Positive = False True, and that has only one F. False Negative is a Type II error because False Negative = False False, so there are two F’s, making it a Type II. (Kudos to Riley Dallas for this method!)

For example, consider a scenario where a person not suffering from cancer is diagnosed as having cancer. This can be really dangerous and sometimes fatal because of the high doses of radiation and chemotherapy that the patient can be exposed to.

  • False Negative (Type II Error):

How often the model predicts ‘no’ for an actual ‘yes’.

The second way is to consider the meanings of these words. False Positive contains one negative word (False), so it’s a Type I error. False Negative has two negative words (False + Negative), so it’s a Type II error.

For example, if a cancer patient is wrongly diagnosed as not having cancer, that individual would either go undiagnosed or be misdiagnosed. Similarly, identifying a fraudulent transaction as non-fraudulent can cause several serious repercussions for a bank. Hence, whenever we intend our model to be a diagnostic aid, we want the number of False Negatives to be as low as possible.
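The "how often" reading of the two error types can be expressed as two rates. This is a small sketch using the standard definitions: the false positive rate is FP / (FP + TN) and the false negative rate is FN / (FN + TP). The counts passed in below are invented for the example.

```python
def error_rates(tp, tn, fp, fn):
    """Return (false positive rate, false negative rate) from the four counts."""
    actual_negatives = fp + tn
    actual_positives = tp + fn
    # Guard against division by zero when a class has no samples
    fpr = fp / actual_negatives if actual_negatives else 0.0
    fnr = fn / actual_positives if actual_positives else 0.0
    return fpr, fnr


# Hypothetical counts, invented for this example
fpr, fnr = error_rates(tp=45, tn=90, fp=10, fn=5)
print(f"Type I  (false positive) rate: {fpr:.1%}")
print(f"Type II (false negative) rate: {fnr:.1%}")
```

Which rate matters more depends on the problem. For a diagnostic aid or a fraud detector, the Type II rate is usually the one to push down first.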

CYBER CRIME CASES AND CONFUSION MATRIX:

Suppose you are on a cybersecurity team and we treat "attack" as the positive class. If the model predicts that an attack is going to happen, you will be alert and take precautions accordingly. If the attack then happens, we can relate it to a True Positive. If the attack does not happen, we can relate it to a False Positive. If the model predicts that no attack will happen and no attack happens in that period, we can relate it to a True Negative. The last case is the False Negative: the model predicts that no attack is going to happen, but a cyberattack happens anyway. This is the case where our company or team will suffer, because we were not expecting any attack at that time. It is a major problem we face when we try to predict attacks with machine learning models.
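Here is a small sketch of how each prediction about an attack maps to one of the four cells, assuming "attack" is the positive class. The function name `classify_outcome` and the alert log are hypothetical, made up for this example.

```python
def classify_outcome(predicted_attack, attack_happened):
    """Name the confusion-matrix cell for one prediction ("attack" = positive)."""
    if predicted_attack and attack_happened:
        return "True Positive"
    if predicted_attack and not attack_happened:
        return "False Positive"  # false alarm
    if not predicted_attack and not attack_happened:
        return "True Negative"
    return "False Negative"  # missed attack


# Hypothetical log of (model predicted an attack, an attack actually happened)
log = [(True, True), (True, False), (False, False), (False, True), (False, False)]

outcomes = [classify_outcome(p, a) for p, a in log]
missed_attacks = outcomes.count("False Negative")

for (p, a), outcome in zip(log, outcomes):
    print(f"predicted={p!s:<5} happened={a!s:<5} -> {outcome}")
print("missed attacks:", missed_attacks)
```

The False Negative row is the one to watch: it is an attack the team was not warned about.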

Conclusion

A confusion matrix is a remarkable approach for evaluating a classification model. It gives a clear picture of how correctly the model has classified each class on the data it was given, and of which classes it has misclassified.

Thank you for reading

Still have a query? Feel free to ask in the comment box.

Please let me know if you have any feedback… If you like this article, then please share this post with your fellow tech enthusiasts.
