Histograms of Quantitative Data 📂Data Science

Definitions 1 2

Formal Definition

A bar chart made from the frequency distribution of quantitative data is called a histogram.

Simple Definition

A histogram is a bar chart in which numerical data are divided into intervals, the number of observations falling in each interval is counted, and each count is drawn as the height of a bar.
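
The counting step in this definition can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `bin_counts`, the equal-width binning rule, and the sample values are all assumptions made for the example.

```python
def bin_counts(data, num_bins):
    """Split the range of `data` into `num_bins` equal-width intervals
    and count how many values fall in each one."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # The maximum value would land one past the last interval,
        # so clamp it into the final bin.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical sample, chosen only for illustration.
sample = [1, 2, 2, 3, 5, 6, 6, 7, 9, 10]
edges, counts = bin_counts(sample, 3)
print(edges)   # interval boundaries
print(counts)  # bar heights, one per interval
```

With these values the three intervals are [1, 4), [4, 7), and [7, 10], and the bar heights are 4, 3, and 3.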

Explanation

Histograms are an indispensable visualization technique in scientific literature, used especially to show the probability distribution of data that involves uncertainty. Ordinary bar charts are familiar to the general public, but a histogram usually needs at least a brief explanation of how to read it, such as the one given in the simple definition.

20220611_135801.png

In the screenshot above, the histogram is made from the quantitative data in column A. Typical bar charts separate their bars to keep the categories visually distinct. Histograms usually leave little or no gap between bars, because adjacent intervals are contiguous and the chart is meant to be read as the shape of a probability distribution.
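
One way to make the link to probability distributions concrete is to rescale the bar heights so that the total area of the bars equals 1. The following Python sketch assumes equal-width bins; the function name `to_density` and the counts are hypothetical, used only for illustration.

```python
def to_density(counts, bin_width):
    """Rescale counts so that the bar areas (height * width) sum to 1."""
    n = sum(counts)
    return [c / (n * bin_width) for c in counts]


# Hypothetical counts for three bins of width 3 (assumed values).
counts = [4, 3, 3]
bin_width = 3.0

density = to_density(counts, bin_width)
total_area = sum(d * bin_width for d in density)

print(density)
print(total_area)  # equals 1 up to floating-point rounding
```

The shape of the chart does not change under this rescaling; only the vertical axis does, which is why contiguous bars can be read as an approximation of a density.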

Note: Bin

In histograms, each interval is called a bin, and the size of the interval is the bin width. Because the data are quantitative, a bin plays the same role as a class interval, and how the bins are chosen can significantly affect the appearance of the histogram. If the number of class intervals is reduced from five to two with the same data, the result is as shown below.

20220611_135930.png

Even with only 13 samples, reducing the number of class intervals this drastically oversimplifies the picture, and the histogram no longer represents the original probability distribution. Being able to grasp a distribution quickly by intuition is important, but overconfidence in a single chart can lead into the trap of subjectivity.
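
The effect of the number of class intervals can be reproduced with a short Python sketch. The 13 values below are hypothetical, not the data from the screenshots, and the helper names `bin_counts` and `text_histogram` are introduced only for this illustration; equal-width bins are assumed.

```python
def bin_counts(data, num_bins):
    """Count values per equal-width bin; the maximum goes in the last bin."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def text_histogram(counts):
    """Render each bin as a row of '#' characters, one per observation."""
    return "\n".join(f"bin {i + 1}: {'#' * c}" for i, c in enumerate(counts))


# 13 hypothetical samples forming two clusters (assumed data).
samples = [1, 2, 2, 3, 3, 3, 4, 7, 8, 8, 9, 9, 10]

five_bins = bin_counts(samples, 5)
two_bins = bin_counts(samples, 2)

print(text_histogram(five_bins))
print()
print(text_histogram(two_bins))
```

With five bins the counts are 3, 4, 0, 3, 3, so the empty middle interval reveals two separate clusters. With two bins the counts are 7 and 6, and that gap disappears entirely.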


  1. Mendenhall. (2012). Introduction to Probability and Statistics (13th Edition): p25, 165.

  2. Kyungpook National University, Department of Statistics. (2008). 엑셀을 이용한 통계학 (Statistics Using Excel): p29.