Descriptive vs Inferential Statistics
Statistics is the discipline concerned with the collection, analysis, and presentation of data. The theory of statistics is divided into two branches according to the kind of information each produces from the data.
What is Descriptive Statistics?
Descriptive statistics is the branch of statistics that quantitatively describes the main properties of a data set. To represent those properties as accurately as possible, the data are summarized using graphical or numerical tools.
Graphical summarization is done by tabulating, grouping, and graphing the values of the variables of interest. Frequency distribution tables and relative frequency histograms are such representations. They portray how the values are distributed across the data set.
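The tabulation step can be sketched in a few lines of Python. This is a minimal illustration only: the grades list is invented for the example, and the text histogram stands in for a proper chart.

```python
from collections import Counter

# Hypothetical exam grades, invented for illustration only.
grades = ["B", "A", "C", "B", "B", "A", "D", "C", "B", "A"]

# Frequency distribution: how many times each value occurs.
frequency = Counter(grades)

# Relative frequency distribution: each count divided by the total count.
total = len(grades)
relative_frequency = {value: count / total for value, count in frequency.items()}

# Print a small table with a text histogram, one row per distinct value.
for value in sorted(frequency):
    bar = "#" * frequency[value]
    print(f"{value}  count={frequency[value]}  share={relative_frequency[value]:.2f}  {bar}")
```

The relative frequencies always sum to 1, which makes distributions from data sets of different sizes directly comparable.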
Numerical summarization involves computing descriptive measures such as the mean, median, and mode. Descriptive measures fall into two classes: measures of central tendency and measures of dispersion (variation). The measures of central tendency are the mean (average), median, and mode. Each has its own range of applicability and usefulness; where one fails, another may represent the data set better. For example, a few extreme values can pull the mean far from the typical value while leaving the median almost unchanged.
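A short sketch using Python's standard statistics module shows how the three measures of central tendency can disagree. The income figures are hypothetical and were chosen so that one extreme value distorts the mean.

```python
import statistics

# Hypothetical household incomes (in thousands); the last value is an outlier.
incomes = [32, 35, 35, 38, 41, 44, 250]

mean = statistics.mean(incomes)      # arithmetic average, sensitive to the outlier
median = statistics.median(incomes)  # middle value of the sorted data
mode = statistics.mode(incomes)      # most frequently occurring value

print(f"mean={mean:.2f}  median={median}  mode={mode}")
```

Here the mean lands well above six of the seven values, while the median and mode stay near the bulk of the data. This is the sense in which one measure may represent a data set better than another.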
As the name implies, measures of dispersion quantify how widely the data are spread. The range, variance, standard deviation, percentile-based ranges such as the interquartile range, and coefficient of variation are measures of dispersion.
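The measures of dispersion listed above can be computed as follows. This is a sketch on invented data. It treats the data as a sample, so the variance uses the n - 1 divisor, and it relies on the default "exclusive" method of statistics.quantiles; other quartile conventions give slightly different values.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
data = [4, 8, 15, 16, 23, 42]

data_range = max(data) - min(data)            # largest value minus smallest value
variance = statistics.variance(data)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)              # square root of the sample variance
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
iqr = q3 - q1                                 # interquartile range
cv = std_dev / statistics.mean(data)          # coefficient of variation (unitless)

print(f"range={data_range}  variance={variance}  std_dev={std_dev:.2f}")
print(f"quartiles=({q1}, {q2}, {q3})  iqr={iqr}  cv={cv:.2f}")
```

If the data covered an entire population rather than a sample, statistics.pvariance and statistics.pstdev would be the corresponding functions.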
A simple example of descriptive statistics is the calculation of a student's Grade Point Average (GPA). The GPA is in essence a weighted mean of the student's results, typically weighted by course credits, and reflects that student's overall academic performance.
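The GPA calculation can be sketched as a weighted mean. The transcript, the credit weights, and the 4.0 grade scale below are assumptions made for illustration; actual schemes vary between institutions.

```python
# Hypothetical transcript: (course, credit hours, grade points on a 4.0 scale).
courses = [
    ("Calculus", 4, 3.7),
    ("Statistics", 3, 4.0),
    ("History", 3, 3.0),
    ("Lab", 2, 3.3),
]


def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


credits = [credit for _, credit, _ in courses]
points = [point for _, _, point in courses]

gpa = weighted_mean(points, credits)
print(f"GPA = {gpa:.2f}")
```

A course with more credit hours moves the result more than a course with fewer, which is what distinguishes the weighted mean from a plain average of the grades.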
What is Inferential Statistics?
Inferential statistics is the branch of statistics that draws conclusions about a population from data obtained from a sample, which is subject to random, observational, and sampling variation. In general, results are obtained from a random sample of the population, and the conclusions derived from the sample are then generalized to the whole population.
The sample is a subset of the population, and the descriptive measures computed from sample data are known simply as statistics. The corresponding measures of the whole population are known as parameters. Parameters are usually unknown, so sample statistics are used to estimate them.
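The relationship between a statistic and a parameter can be illustrated with a small simulation. Everything here is hypothetical: the population is generated artificially so that its mean is known, which is rarely the case in practice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in centimetres.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: a measure of the whole population (normally unknown).
population_mean = statistics.fmean(population)

# Statistic: the same measure computed from a random sample of 100 values.
sample = random.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The two values are close but not identical. That gap is the sampling variation that inferential statistics tries to quantify.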
Inferential statistics focuses on how to generalize the statistics obtained from a sample to the population as accurately as possible. One factor of concern is the nature of the sample. If the sample is biased, the results are also biased, and parameter estimates based on them do not represent the population correctly. Sampling is therefore an important area of study in inferential statistics. Statistical assumptions, statistical decision theory, estimation theory, hypothesis testing, design of experiments, analysis of variance, and regression analysis are prominent topics in the theory of inferential statistics.
A good example of inferential statistics in action is predicting the result of an election before the vote by means of polling.
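The polling example can be sketched as an interval estimate for a population proportion. The poll numbers are invented, the helper function name is hypothetical, and the normal-approximation (Wald) interval used here is only one of several possible methods. It also assumes a simple random sample, which real polls seldom achieve exactly.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion (z=1.96 gives about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n                              # sample proportion (statistic)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)    # margin of error
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical poll: 540 of 1,000 respondents favor candidate A.
p_hat, low, high = proportion_confidence_interval(540, 1000)
print(f"estimate={p_hat:.3f}  approx. 95% interval=[{low:.3f}, {high:.3f}]")
```

The interval, rather than the single estimate, is what gets generalized to the population: it states a plausible range for the unknown share of all voters.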
What is the difference between Descriptive and Inferential Statistics?
• Descriptive statistics focuses on summarizing the data collected from a sample. It produces measures of central tendency and dispersion, which describe where the values of the variables are concentrated and how widely they are spread.
• Inferential statistics generalizes the statistics obtained from a sample to the population to which the sample belongs. The corresponding measures of the population are called parameters.
• Descriptive statistics only summarizes the properties of the sample from which the data were acquired, whereas inferential statistics uses measures from the sample to infer properties of the population.
• In inferential statistics, parameters are estimated from a sample rather than measured on the whole population; therefore, the estimates always carry some uncertainty relative to the true values.