Most Commonly Used Numbers: Statistics, Psycholinguistics, and Benford's Law
Introduction: The Number as a Unit of Information and a Cultural Marker
The question of how often particular numbers occur seems simple, but its analysis lies at the intersection of mathematical statistics, the psychology of perception, linguistics, and information theory. It is important to distinguish between the natural frequency with which numbers occur in real-world numerical data and their subjective frequency in human practice (in numbers, prices, elections). Most surprisingly, these distributions are neither random nor uniform; they follow deep regularities that matter for data analysis, fraud detection, and understanding cognitive biases.
1. Benford's Law: Unexpected Asymmetry in the World of Numbers
The most powerful and counter-intuitive fact about the frequency of numbers is described by Benford's Law (the law of the first digit). It states that in many natural sets of numerical data (from electricity bills and mountain heights to molecular weights and stock market quotations), the probability that the first significant digit (from 1 to 9) will be equal to d is calculated by the formula: P(d) = log₁₀(1 + 1/d).
This gives the following distribution of probabilities for the first digit:
1 appears approximately in 30.1% of cases.
2 — about 17.6%.
3 — about 12.5%.
The frequency then decreases: 9 occurs only in 4.6% of cases.
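The percentages above follow directly from the formula. A minimal Python sketch that tabulates them is given here; the function name benford_probability and the dictionary benford are illustrative names, not part of the original text.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the first significant digit equals d (1..9)."""
    if not 1 <= d <= 9:
        raise ValueError("first significant digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Full first-digit distribution predicted by Benford's Law.
benford = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to 1, because the product of the terms (1 + 1/d) for d from 1 to 9 telescopes to 10 and log₁₀(10) = 1.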
Reason: The law holds for data that span many orders of magnitude (from units to millions) and arise from processes of growth or multiplication, for example city populations, stock prices, and lake areas. The digit 1 leads because a value must grow by 100% to move from 1 to 2, but only by 12.5% to move from 8 to 9. A growing quantity therefore “sticks” longer at values that start with 1.
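The “sticking” effect can be illustrated with a small simulation. The sketch below assumes a quantity that grows by a fixed 1% per step (an arbitrary choice made for illustration) and records its leading digit at every step; the function name simulate_growth and its parameters are hypothetical.

```python
from collections import Counter


def simulate_growth(rate: float = 0.01, steps: int = 100_000) -> dict:
    """Share of steps spent at each leading digit under steady growth.

    Only the mantissa (the value scaled into [1, 10)) is tracked, so the
    quantity can grow across hundreds of orders of magnitude without overflow.
    """
    mantissa = 1.0
    counts = Counter()
    for _ in range(steps):
        counts[int(mantissa)] += 1   # leading digit of the current value
        mantissa *= 1 + rate         # one growth step
        if mantissa >= 10:           # crossed into the next order of magnitude
            mantissa /= 10
    return {d: counts[d] / steps for d in range(1, 10)}


frequencies = simulate_growth()
for d, share in frequencies.items():
    print(f"{d}: {share:.1%}")
```

Under these assumptions the simulated shares should land close to the Benford percentages, with the leading digit 1 occupying roughly 30% of the steps.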
Application: Tax and financial authorities around the world use Benford's Law to detect suspicious reporting and falsified data, as a person inventing numbers intuitively tends to a uniform d ...
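One way such a screening check might look is a chi-square goodness-of-fit test of the observed first digits against the Benford distribution. This is an assumption made for illustration: the text does not say which statistical test the authorities use, and both data sets below are hypothetical (powers of 2 stand in for growth-like data, uniformly drawn three-digit amounts stand in for invented figures).

```python
import math
import random
from collections import Counter

# Critical value of the chi-square distribution for 8 degrees of freedom
# (9 possible digits minus 1) at the 5% significance level.
CHI2_CRITICAL_DF8_5PCT = 15.507


def first_digit(n: int) -> int:
    """First significant digit of a non-zero integer."""
    return int(str(abs(n))[0])


def benford_chi_square(values) -> float:
    """Chi-square statistic of observed first digits against Benford's Law."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    statistic = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic


# Hypothetical data set 1: powers of 2, a multiplicative process.
growth_like = [2 ** n for n in range(1, 1001)]

# Hypothetical data set 2: "invented" amounts drawn uniformly from 100..999.
rng = random.Random(42)
invented = [rng.randint(100, 999) for _ in range(1000)]

for name, data in [("growth-like", growth_like), ("invented", invented)]:
    stat = benford_chi_square(data)
    verdict = "suspicious" if stat > CHI2_CRITICAL_DF8_5PCT else "consistent with Benford"
    print(f"{name}: chi-square = {stat:.1f} -> {verdict}")
```

A statistic above the critical value only flags a data set for closer inspection. Data that do not span several orders of magnitude can fail the test without any manipulation.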