Introduction
Statistics is a widely used discipline concerned with data collection, data organization, data analysis, data interpretation, and data visualization. In the past, statistics was practiced mainly by statisticians, economists, and business owners to calculate and present data relevant to their fields. Today, statistics plays a pivotal role in diverse fields such as data science, machine learning, data analysis, business intelligence, computer science, and many more.
Statistics is a type of mathematical analysis that uses quantified models and representations to analyze a set of experimental data or real-world research. A key benefit of statistics is that it presents information in an easy-to-understand form.
Statistical & Non-Statistical Analysis
Statistical analysis is used to better understand a wider population by analyzing data from a sample. It allows inferences about target markets, consumer cohorts, and the general population by extrapolating from the sample to forecast the behavior and attributes of the many based on the few. Data from multiple sources can be combined to support the analysis.
Non-statistical sampling refers to the selection of a test group based on the examiner's judgment rather than a formal statistical procedure. An examiner, for example, could use his or her own discretion to determine one or more of the following:
- The number of samples
- Which items are chosen for the test group
- How the outcomes are assessed
Statistical analysis is often described as quantitative analysis, while non-statistical analysis is described as qualitative analysis.
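The difference between statistical and non-statistical sampling can be sketched in a few lines of Python. The population of invoice amounts, the sample size of 50, and the "pick the largest items" rule are all hypothetical choices made for illustration, not part of any standard procedure.

```python
import random
import statistics

# Hypothetical population: 1,000 invoice amounts (made-up data).
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [round(rng.uniform(10, 500), 2) for _ in range(1000)]

# Statistical sampling: every item has the same chance of being selected.
random_sample = rng.sample(population, k=50)

# Non-statistical (judgmental) sampling: the examiner chooses items by
# discretion. Here the assumed rule is "inspect the 50 largest invoices".
judgment_sample = sorted(population, reverse=True)[:50]

print("population mean:     ", round(statistics.mean(population), 2))
print("random sample mean:  ", round(statistics.mean(random_sample), 2))
print("judgment sample mean:", round(statistics.mean(judgment_sample), 2))
```

The random sample tends to give a mean close to the population mean, while the judgment sample reflects the examiner's selection rule rather than the population as a whole.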
Categories of Statistics
There are two types of statistics. Descriptive analysis is an essential first step before inferential analysis:
- Descriptive statistics:
  - Present, organize, summarize, and describe the collected data using measures of center, measures of spread, the shape of the distribution, and outliers.
  - Plots of the data can also be used to gain a better understanding.
  - Helps us organize data and focus on its main characteristics.
  - Provides a summary of data numerically or graphically.
- Inferential statistics:
  - This is where we run different tests on a sample and draw conclusions that can be generalized to a larger population.
  - Performing inferential statistics well requires a sample that accurately represents the population of interest.
  - It generalizes from the sample to the larger population and applies probability theory to draw conclusions.
  - It allows us to infer population parameters based on sample statistics and to model relationships within the data.
  - Note: Modelling allows us to develop mathematical equations that describe the inter-relationships among two or more variables.
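As a minimal sketch of the two categories, the snippet below first describes a sample and then uses it to estimate a population mean. The height values are made up, and the interval uses a normal approximation for simplicity; with only 10 observations a t-based interval would be somewhat wider.

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 10 people drawn from a larger population.
sample = [162.0, 170.5, 168.2, 175.1, 159.8, 181.3, 166.4, 172.9, 169.0, 177.6]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: estimate the population mean from the sample.
# Assumption: normal approximation for an approximate 95% confidence interval.
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96
margin = z * stdev / math.sqrt(len(sample))
ci = (mean - margin, mean + margin)

print(f"sample mean = {mean:.2f}, sample stdev = {stdev:.2f}")
print(f"approximate 95% CI for the population mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```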
Statistical Terminologies
Statistics is used in many sectors:
- Insurance
- Stock Market
- Genetics
- Medical Studies
- Shopping
- Weather Forecasting
There are various statistical terms that one should be aware of while dealing with statistics. Some of them are:
- Population: A group from which data is to be collected.
- Sample: A sample is a subset of the population.
- Variable: A variable is a characteristic of a member of a population that differs in quantity or quality from one member to another.
- Quantitative Variable: A variable differing in quantity. Example: weight of a person, number of people in a car, etc.
- Qualitative Variable: A variable differing in quality. Example: the color of a car, degree of damage to a car in an accident, etc.
- Discrete Variable: A discrete variable is one in which no value can be assumed between two adjacent possible values. Example: Number of children in a family.
- Continuous Variable: A continuous variable is one in which any value can be assumed between two given values. Example: The time taken for a 100m race.
Types of Statistical Measure
There are four types of statistical measures used to describe data:
- Measure of Frequency:
  - The frequency indicates the number of occurrences of a particular data value in the given dataset.
  - The measures of frequency are count and percentage.
- Measure of Central Tendency:
  - Central tendency indicates the central or typical value around which the data values cluster.
  - The measures of central tendency are Mean, Median, and Mode.
- Measure of Spread:
  - Spread describes how similar or varied the set of observed values is for a particular variable.
  - The measures of spread are Standard Deviation, Variance, and the Interquartile Range (derived from the Quartiles).
  - The measure of spread is also called the measure of dispersion.
- Measure of Position:
  - Position identifies where a particular data value stands relative to the other values in the given dataset.
  - The measures of position are Percentiles, Quartiles, and Standard Scores.
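The four kinds of measures listed above can be computed with Python's standard library. The dataset is hypothetical, and population formulas (`pvariance`, `pstdev`) are used on the assumption that the list is the whole dataset rather than a sample.

```python
import statistics
from collections import Counter

data = [2, 4, 4, 5, 7, 7, 7, 9, 10, 12]  # hypothetical dataset

# Measure of frequency: count and percentage of each value.
counts = Counter(data)
percentages = {value: 100 * count / len(data) for value, count in counts.items()}

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of spread (population versions, see the assumption above).
variance = statistics.pvariance(data)
stdev = statistics.pstdev(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
iqr = q3 - q1

# Measures of position: standard scores (z-scores) for each value.
z_scores = [(x - mean) / stdev for x in data]

print("counts:", dict(counts))
print("mean, median, mode:", mean, median, mode)
print("variance, stdev, IQR:", round(variance, 2), round(stdev, 2), iqr)
print("z-score of the largest value:", round(z_scores[-1], 2))
```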
Hypothesis Testing
Hypothesis Testing is an inferential statistical technique to determine whether there is enough evidence in the data sample to infer that a certain condition holds true for the entire population.
Steps:
- State a hypothesis about a population parameter.
- Take a random sample.
- Analyze the properties of the sample.
- Test whether the conclusions drawn from the sample hold for the population.
Two types of hypothesis:
- Null Hypothesis (H0):
  - The null hypothesis is assumed to be true unless there is strong evidence to the contrary.
  - It typically states that no difference or relationship exists between the variables.
  - Example:
    - A pharmaceutical company has introduced a medicine in the market for a particular disease, people have been using it for a considerable period of time, and it is generally considered safe.
    - The statement "the medicine is safe" is the null hypothesis.
- Alternative Hypothesis (H1):
  - The alternative hypothesis is the claim accepted when the data provide enough evidence to reject the null hypothesis.
  - Example:
    - In the above example, the alternative hypothesis is that the medicine is unsafe. The null hypothesis is rejected in favor of the alternative only if the data show sufficient evidence that the medicine is unsafe.
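The mechanics of testing a null hypothesis against an alternative can be sketched with a one-sample z-test. Everything here is an assumption made for illustration: the hypothesized mean of 100, the known population standard deviation of 15, the 5% significance level, and the sample values themselves.

```python
import math
import statistics

# Hypothetical setup: H0 says the population mean is 100; H1 says it is not.
mu0 = 100.0
sigma = 15.0   # population standard deviation, assumed known for a z-test
alpha = 0.05   # significance level, chosen before looking at the data

# Made-up random sample of 16 measurements.
sample = [112, 104, 98, 121, 109, 115, 101, 118,
          107, 110, 95, 119, 113, 106, 116, 108]

n = len(sample)
sample_mean = statistics.mean(sample)

# Test statistic: distance of the sample mean from mu0 in standard errors.
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Two-sided p-value under H0, using the standard normal distribution.
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))

reject_h0 = p_value < alpha
print(f"sample mean = {sample_mean}, z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Failing to reject H0 would not prove it true; it would only mean the sample did not provide strong enough evidence against it.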
Variable Types
Based on their nature, variables are classified into four types:
- Nominal variables:
  - This has two or more categories with no intrinsic order among the values.
  - Example: Gender and Blood Group
- Ordinal variables:
  - This has values in a logical order. However, the relative distance between two data values is not clear.
  - Example: Size of a coffee cup (S/M/L), ratings of the product (Good/Avg/Bad)
- Interval variables:
  - With an interval scale, equal differences between scale values have equal quantitative meaning.
  - An interval scale provides more quantitative information than an ordinal scale, but it does not have a true zero point.
  - Example: The Fahrenheit degree scale used to measure temperature, where 0 °F does not mean the absence of temperature.
- Ratio variables:
  - Ratio scales are similar to interval scales in that equal differences between scale values have equal quantitative meaning.
  - Unlike an interval scale, a ratio scale has a true zero point, so ratios between values are meaningful.
  - Example: Length in inches measured with a common ruler.
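The practical difference between an interval scale and a ratio scale can be illustrated with temperature. The Kelvin scale is brought in here only as an example of a scale with a true zero, and the two temperatures are arbitrary.

```python
def fahrenheit_to_kelvin(f):
    """Convert a Fahrenheit temperature to Kelvin (a scale with a true zero)."""
    return (f - 32.0) * 5.0 / 9.0 + 273.15

low_f, high_f = 40.0, 80.0  # arbitrary example temperatures

# Differences are meaningful on an interval scale.
difference_f = high_f - low_f

# Ratios are not: 80 °F is not "twice as hot" as 40 °F,
# because 0 °F is an arbitrary point rather than a true zero.
naive_ratio = high_f / low_f

# On a ratio scale such as Kelvin, the ratio is meaningful.
true_ratio = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

print("difference in °F:", difference_f)
print("naive ratio in °F:", naive_ratio)
print("ratio in kelvin:", round(true_ratio, 3))
```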
Hypothesis Testing Procedure
There are two Hypothesis Testing Procedures:
- Parametric Tests:
  - Traditional tests such as the t-test or ANOVA are called parametric tests. They assume the data follow a specified probability distribution whose form is known except for a set of free parameters.
  - If the population distribution is fully described by its parameters, a parametric test can be used.
- Non-parametric Tests:
  - If the population distribution or its parameters are not known and a hypothesis about the population still needs to be tested, a non-parametric test is used.
  - They do not require any strict distributional assumptions.
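As one example of a non-parametric procedure, the sign test below looks only at the direction of each paired difference and makes no assumption about the shape of the distribution. The before/after values are made up, and the sign test is just one of several non-parametric tests that could be used here.

```python
import math

# Hypothetical paired measurements for 10 subjects.
before = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
after = [70, 65, 76, 74, 64, 69, 73, 69, 70, 66]

# Keep only the non-zero differences; ties carry no sign information.
diffs = [a - b for a, b in zip(after, before) if a != b]
n = len(diffs)
n_pos = sum(d > 0 for d in diffs)
k = min(n_pos, n - n_pos)

# Under H0 (median difference is zero) the number of positive signs
# follows a Binomial(n, 0.5) distribution, so the p-value is exact.
tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
p_value = min(1.0, 2 * tail)  # two-sided

print(f"{n_pos} positive signs out of {n}, p-value = {p_value:.4f}")
```

A parametric alternative for the same data would be a paired t-test, which additionally assumes the differences are approximately normally distributed.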