Posts

Showing posts from September, 2020

COVID19 Analysis using Power BI Desktop

Analyzing the data before moving on to predictions is very important. Understanding a few rows and a few columns is a trivial task, and we can easily examine such data by eye. With somewhat larger data, say 10,000 rows and 50 columns, we really need to analyze the data to find out which factors are going to affect our prediction. Data analysis with Python is a bit tedious because we have to prepare the data first, i.e. cleaning, pre-processing and normalization. We use Seaborn and Matplotlib for data visualization, but before plotting the graphs we need to know which columns are related to each other. For that, we need a correlation matrix, which we can create using Python. However, a PPS (Predictive Power Score) matrix is often more informative than a correlation matrix, since it can also pick up non-linear and asymmetric relationships.

Fig1: Correlation Matrix of Covid19 dataset

Fig2: PPS Matrix of Covid19 dataset

Writing code for data analysis is always a tedious task, so there are tools available in the market for it, such as Power BI, Tableau, etc. I have done
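The correlation matrix mentioned in this excerpt can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the column names and values are made-up toy data, not the Covid19 dataset from the post, and the helper names `pearson` and `correlation_matrix` are hypothetical. In practice, pandas' `DataFrame.corr()` computes the same Pearson coefficients, and the result can be drawn with Seaborn's `heatmap`.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict {col_a: {col_b: r}} covering every pair of columns."""
    return {
        a: {b: pearson(columns[a], columns[b]) for b in columns}
        for a in columns
    }


# Hypothetical toy data (assumption): three numeric columns, five rows.
data = {
    "confirmed": [10, 20, 30, 40, 50],
    "deaths": [1, 2, 3, 4, 5],
    "recovered": [50, 40, 30, 20, 10],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values close to 1 or -1 mark column pairs that move together (or in opposite directions) and are worth plotting first; values near 0 suggest no linear relationship.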

Supervised Learning

In supervised learning, we train the machine using data that is well labelled. This means that some data is already tagged with the correct answer.

Types of Supervised Machine Learning Algorithms: Regression (Linear and Multi-linear); Logistic Regression; Classification; Naïve Bayes Classifiers; Decision Trees; Support Vector Machine (SVM).

Challenges in Supervised Machine Learning: Irrelevant input features in the training data can give inaccurate or wrong results. Data preparation and pre-processing is always a challenge. Accuracy suffers when impossible, unlikely, or incomplete values have been entered as training data. If a domain expert is not available, the other approach is "brute force": you have to work out for yourself which features are the right ones to train the machine on, and the result could be inaccurate.

Advantages of Supervised Learning: Supervised learning allows you to collect data or produce a data output from the previous experience. Helps you to optimize performance criteria u
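The idea of training on labelled data can be illustrated with the simplest algorithm from the list above, linear regression. The following is a minimal sketch, assuming a single input feature and made-up labelled examples; the names `fit_line` and `predict` are hypothetical helpers written for this illustration, not part of any library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labelled training data (assumption): each input in
# `features` is tagged with its correct answer in `labels`.
features = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]

model = fit_line(features, labels)  # training step
prediction = predict(model, 5.0)    # prediction for an unseen input
print(model, prediction)
```

The labels play the role of the "correct answers": the training step only adjusts the slope and intercept so that the predictions match them as closely as possible.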