
Supervised Learning



In supervised learning, we train the machine using well-labelled data, that is, data that is already tagged with the correct answer.
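
To make the idea of "data tagged with the correct answer" concrete, here is a minimal sketch in Python. The animal measurements and the `predict` helper are made up for illustration, and nearest-neighbour matching is just one simple way to learn from labelled examples, not the only one.

```python
import math

# Labelled training data: each example pairs input features with the correct answer.
# The numbers are invented for illustration: (height_cm, weight_kg) -> label.
training_data = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((70.0, 35.0), "dog"),
]


def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], features))
    return nearest[1]


print(predict((22.0, 4.5)))   # cat
print(predict((65.0, 32.0)))  # dog
```

The labels in `training_data` play the role of the "correct answers": the machine never sees a rule for telling cats from dogs, only tagged examples.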

Types of Supervised Machine Learning Algorithms:

  • Regression (Linear and Multiple Linear)
  • Logistic Regression
  • Classification
  • Naïve Bayes Classifiers
  • Decision Trees
  • Support Vector Machine (SVM)
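
As a small illustration of the first item in the list, linear regression with a single input can be fitted with the ordinary least-squares formulas. This is only a sketch: the `fit_line` helper and the sample points are assumptions chosen for this example, and a real project would more likely rely on a library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labelled data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)          # 2.0 1.0
print(slope * 5.0 + intercept)   # prediction for a new input x = 5
```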

Challenges in Supervised Machine Learning:

  • Irrelevant input features in the training data can produce inaccurate or wrong results.
  • Data preparation and pre-processing are always a challenge.
  • Accuracy suffers when impossible, unlikely, or incomplete values are entered as training data.
  • If a domain expert is not available, the other approach is "brute force": you have to guess which features are the right ones to train the machine on, and that guess could be inaccurate.
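
The point about impossible and incomplete values can be sketched as a simple filtering step before training. The field names (`age`, `income`, `label`) and the plausibility limits below are hypothetical; in practice the right checks depend on the dataset.

```python
# Made-up raw rows; some contain impossible or missing values.
rows = [
    {"age": 34, "income": 52000.0, "label": "yes"},
    {"age": -5, "income": 41000.0, "label": "no"},    # impossible age
    {"age": 51, "income": None, "label": "yes"},      # incomplete row
    {"age": 230, "income": 38000.0, "label": "no"},   # implausible age
    {"age": 27, "income": 30000.0, "label": "no"},
]


def is_valid(row):
    """Keep a row only if nothing is missing and values are in a plausible range."""
    if any(value is None for value in row.values()):
        return False
    # Assumed limits for this example: age between 0 and 120, non-negative income.
    return 0 <= row["age"] <= 120 and row["income"] >= 0


clean_rows = [row for row in rows if is_valid(row)]
print(len(clean_rows))  # 2
```

Dropping rows is the bluntest option. Depending on how much data is lost, filling in missing values may be preferable.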

Advantages of Supervised Learning:

  • Supervised learning lets you collect data or produce outputs based on previous experience.
  • It helps you optimize performance criteria using experience.
  • It helps you solve various types of real-world computation problems.

Disadvantages of Supervised Learning:

  • The decision boundary might be over-trained if your training set lacks examples that you want to have in a class.
  • You need to select lots of good examples from each class while you are training the classifier.
  • Classifying big data can be a real challenge.
  • Training for supervised learning needs a lot of computation time.
Figure 1: Supervised Learning
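
One common way to notice an over-trained decision boundary is to compare accuracy on the training set with accuracy on examples that were held back from training. The sketch below uses invented one-dimensional data with one deliberately noisy label and a simple nearest-neighbour rule; the names `train`, `held_out`, `predict` and `accuracy` are assumptions for this example.

```python
# Made-up labelled data: one numeric feature -> class.
# The point at 3.0 carries a noisy label that the model will memorise.
train = [
    (1.0, "low"), (2.0, "low"), (3.0, "high"), (4.0, "low"),
    (6.0, "high"), (7.0, "high"), (8.0, "high"),
]
# Examples kept aside and never used for training.
held_out = [(2.9, "low"), (3.2, "low"), (5.5, "high"), (7.5, "high")]


def predict(x):
    """1-nearest neighbour: copy the label of the closest training point."""
    nearest = min(train, key=lambda example: abs(example[0] - x))
    return nearest[1]


def accuracy(examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, label in examples if predict(x) == label)
    return correct / len(examples)


print(accuracy(train))     # 1.0
print(accuracy(held_out))  # 0.5
```

A perfect score on the training data next to a much lower score on held-out data suggests the boundary has fitted the training set too closely.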


