Experience is a great teacher. And in my experience, the beginning has always been tough.

As a BI consultant & trainer, I have worked with a wide range of people, from frontline employees to CEOs. Our interactions gave me great insights into the many challenges they face.

I have put together a few pages from my experience and learning, covering Power BI, Excel, Power Query, Power Pivot, Tableau, data analytics & data visualization.

The purpose of the blog is to get you started.

I hope you’ll find them helpful.

Vivek

Statistics Simplified is the series to express statistics in layman terms.


Central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data.


Put another way, it is the one point in the data set that balances the entire data set.


Central tendency is also known as the measure of central location or, more accessibly, the average.

Measures of Central Tendency

There are three measures of central tendency: mean, median, and mode.

The most popular measure of central tendency is the arithmetic mean, which Excel calculates with the AVERAGE function.
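To see the three measures side by side, here is a minimal sketch using Python's built-in `statistics` module. The sales figures are made up for illustration; they are not from any real data set.

```python
import statistics

# Hypothetical monthly sales figures (made up for illustration)
sales = [12, 15, 15, 18, 20, 22, 95]

mean_value = statistics.mean(sales)      # arithmetic mean, like AVERAGE in Excel
median_value = statistics.median(sales)  # middle value once the data is sorted
mode_value = statistics.mode(sales)      # most frequently occurring value

print("Mean:", round(mean_value, 2))
print("Median:", median_value)
print("Mode:", mode_value)
```

Notice how the single large value (95) pulls the mean well above the median, while the median and mode stay close to where most of the values sit.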


Depending on the data type, we use the appropriate measure of central tendency.


Also see:

Variation

Mean vs. Median

Trim Mean

Geometric Mean



#vivran



Identifying data types is crucial in data analytics. We should know the data type before we start the analysis, and the reason is simple: once we understand the data type, we can apply the appropriate mathematical aggregations and statistical tests.

We categorize data into two primary categories: Qualitative & Quantitative

We can understand data types by the following example:

Discrete data primarily consists of counts and percentages.

The fundamental difference between continuous and discrete data is that continuous data is always associated with a unit or a scale, e.g., kilogram, meter, centimeter, degree Celsius, years.

Each data type has its level of measurement:


And depending on the data type, we can decide on the underlying mathematical aggregations:
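As a rough illustration of that idea, the sketch below maps each level of measurement to the measures of central tendency that are usually considered meaningful for it. The four level names (nominal, ordinal, interval, ratio) and the mapping follow the commonly taught convention and are an assumption here, not a reproduction of the original table. The `summarize` helper and the sample values are hypothetical.

```python
import statistics

# Assumed mapping (commonly taught convention, not taken from the original table)
ALLOWED_MEASURES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}

MEASURE_FUNCTIONS = {
    "mode": statistics.mode,
    "median": statistics.median,
    "mean": statistics.mean,
}


def summarize(values, level):
    """Return only the measures that make sense for the given level."""
    return {name: MEASURE_FUNCTIONS[name](values) for name in ALLOWED_MEASURES[level]}


# Hypothetical examples
colors = summarize(["red", "blue", "red"], "nominal")   # qualitative labels
ratings = summarize([1, 2, 2, 3, 5], "ordinal")         # ranked survey scores
weights = summarize([60.0, 60.0, 72.0], "ratio")        # kilograms

print(colors)
print(ratings)
print(weights)
```

The point of the design is that the data type decides which calculations we allow: taking a mean of colors makes no sense, so the helper never attempts it.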


Also see: Central Tendency & Variation







How do we decide whether the rice is perfectly cooked?




We randomly pick one grain of rice and check it. Based on what we find in that single grain, we infer whether the entire bowl of rice is perfectly cooked.




Sampling is a process of understanding the behavior of the entire group by learning the behavior of a portion of the group.


The single grain of rice in the above example is a sample, and the process of picking the grain is known as sampling.




Why do we sample?


The primary reason is that it is easier to collect data for a sample than for the entire population.


Collecting data through sampling is practical, economical, handy, and adaptable.


Types of Sampling Techniques


There are two popular sampling methods:

  • Probability Sampling

  • Non-Probability Sampling


In probability sampling, every member of the population has a known, non-zero chance of being selected. We use this technique when we want our sample to represent the entire population. We use it for quantitative analysis.


In non-probability sampling, members are selected based on specific criteria, so not every individual has a chance of being selected. This technique is applicable in research and qualitative analysis. It helps us get a basic understanding of a small group or population under a specific study.



This article explains the popular methods used in probability sampling.


Simple Random Sample


We randomly pick samples from the entire population. Every member has an equal opportunity of getting selected.



For example, to conduct an employee survey, we randomly pick employees from across the organization.
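A simple random sample of employees can be sketched with Python's `random` module. The employee IDs, the sample size, and the seed below are assumptions chosen only for illustration.

```python
import random

# Hypothetical population: 20 employees with IDs 1 to 20
employee_ids = list(range(1, 21))

# A fixed seed keeps the illustration repeatable; drop it for a fresh draw each time
rng = random.Random(42)

# Every employee has an equal chance of being picked, and nobody is picked twice
sample = rng.sample(employee_ids, k=5)

print(sample)
```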


Systematic Sample


We randomly pick the first item, and after that, we choose every nth item in the data.

In the example below, after choosing the second item, we pick every fourth element from the population.


For the same survey, we arrange the employees by employee ID and randomly pick a starting employee (Emp ID 2). Then we select every 4th employee after that (Emp ID 6, 10, 14…).
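The same idea can be sketched in a few lines of Python. The `systematic_sample` function and the list of 20 employee IDs are hypothetical; the starting point and step match the Emp ID 2, every-4th example.

```python
import random


def systematic_sample(items, step, start=None):
    """Pick a starting position, then take every step-th item after it."""
    if start is None:
        # Random starting position within the first `step` items
        start = random.randrange(step)
    return items[start::step]


# Hypothetical population: employees sorted by ID, 1 to 20
employee_ids = list(range(1, 21))

# Position 1 in the list is Emp ID 2; then every 4th employee after that
picked = systematic_sample(employee_ids, step=4, start=1)

print(picked)  # [2, 6, 10, 14, 18]
```

Leaving out `start` lets the function choose the first employee at random, which is how the method is normally applied.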


Stratified Sample


Strata means layers. We divide the population into layers (groups) and pick members for the sample from each layer. For example, if we have four groups, we ensure we select at least one member from each group.


This time for the survey, we arrange the data by employee designation (Associate, SME, TL, Manager…). Then we randomly pick employees from each designation group.
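Here is a minimal sketch of stratified sampling in Python. The `stratified_sample` function, the employee records, and the choice of one employee per designation are all assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_stratum, rng):
    """Group records into strata, then pick randomly within each stratum."""
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for members in strata.values():
        # Never ask for more members than the stratum contains
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical employees as (employee ID, designation) pairs
employees = [
    ("E01", "Associate"), ("E02", "Associate"), ("E03", "Associate"),
    ("E04", "SME"), ("E05", "SME"),
    ("E06", "TL"), ("E07", "TL"),
    ("E08", "Manager"),
]

picked = stratified_sample(
    employees, key=lambda e: e[1], per_stratum=1, rng=random.Random(7)
)

print(picked)
```

Because we draw from every designation group, no layer of the organization is left out of the sample, even a small one such as Manager.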


Cluster Sample


We divide the population into different clusters with similar characteristics. Then we randomly pick entire clusters, and every member of a chosen cluster becomes part of the sample.


For the survey, we randomly pick a department or team and survey all the employees within it.
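Cluster sampling can be sketched as follows. The `cluster_sample` function, the department names, and the employee IDs are hypothetical and used only to illustrate the approach.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and return every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members


# Hypothetical departments and their employees
departments = {
    "Finance": ["E01", "E02", "E03"],
    "HR": ["E04", "E05"],
    "IT": ["E06", "E07", "E08", "E09"],
}

chosen, picked = cluster_sample(departments, n_clusters=1, rng=random.Random(3))

print(chosen, picked)
```

Unlike the stratified sample, the randomness here applies to the groups themselves: once a department is chosen, everyone in it is surveyed.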
