History of statistics
Statistics, in the modern sense of the word, began evolving in the 18th century in response to the novel needs of industrializing sovereign states.
In early times, the meaning was restricted to information about states, particularly demographics such as population. This was later extended to include all collections of information of all types, and later still it was extended to include the analysis and interpretation of such data. In modern terms, "statistics" means both sets of collected information, as in national accounts and temperature records, and analytical work which requires statistical inference. Statistical activities are often associated with models expressed using probabilities, hence the connection with probability theory. The heavy demands of data processing have made statistics a key application of computing. A number of statistical concepts have had an important impact on a wide range of sciences. These include the design of experiments and approaches to statistical inference such as Bayesian inference, each of which can be considered to have its own sequence in the development of the ideas underlying modern statistics.
Introduction
By the 18th century, the term "statistics" designated the systematic collection of demographic and economic data by states. For at least two millennia, these data were mainly tabulations of human and material resources that might be taxed or put to military use. In the early 19th century, collection intensified, and the meaning of "statistics" broadened to include the discipline concerned with the collection, summary, and analysis of data. Today, data is collected and statistics are computed and widely distributed in government, business, most of the sciences and sports, and even for many pastimes. Electronic computers have expedited more elaborate statistical computation even as they have facilitated the collection and aggregation of data. A single data analyst may have available a set of data-files with millions of records, each with dozens or hundreds of separate measurements. These were collected over time from computer activity or from computerized sensors, point-of-sale registers, and so on. Computers then produce simple, accurate summaries and allow more tedious analyses, such as those that require inverting a large matrix or performing hundreds of steps of iteration, that would never be attempted by hand. Faster computing has allowed statisticians to develop "computer-intensive" methods which may look at all permutations, or use randomization to look at 10,000 permutations of a problem, to estimate answers that are not easy to quantify by theory alone.

The term "mathematical statistics" designates the mathematical theories of probability and statistical inference, which are used in statistical practice. The relation between statistics and probability theory developed rather late, however. In the 19th century, statistics increasingly used probability theory, whose initial results were found in the 17th and 18th centuries, particularly in the analysis of games of chance. By 1800, astronomy used probability models and statistical theories, particularly the method of least squares. Early probability theory and statistics were systematized in the 19th century, and statistical reasoning and probability models were used by social scientists to advance the new sciences of experimental psychology and sociology, and by physical scientists in thermodynamics and statistical mechanics. The development of statistical reasoning was closely associated with the development of inductive logic and the scientific method, which are concerns that move statisticians away from the narrower area of mathematical statistics. Much of the theoretical work was readily available by the time computers were able to exploit it. By the 1970s, Johnson and Kotz had produced a four-volume Compendium on Statistical Distributions, which is still an invaluable resource.
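As an illustration of the kind of "computer-intensive" randomization method mentioned above, the following sketch estimates a two-sample p-value from 10,000 random relabellings; the data values are invented purely for illustration and do not come from any source cited here.

import random

# Hypothetical two-sample data; the values are invented for illustration only.
group_a = [12.1, 9.8, 11.4, 10.9, 12.7, 11.0]
group_b = [10.2, 9.5, 10.8, 9.9, 10.4, 10.1]
observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

pooled = group_a + group_b
n_a = len(group_a)
n_iter = 10_000                         # number of random relabellings, as in the example above
extreme = 0
for _ in range(n_iter):
    random.shuffle(pooled)              # randomly relabel all observations
    diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
    if abs(diff) >= abs(observed):      # at least as extreme as the observed difference
        extreme += 1
print("estimated p-value:", extreme / n_iter)

The estimated p-value simply counts how often a random relabelling produces a difference in means at least as large as the one observed, which is the essence of such randomization tests.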
Applied statistics can be regarded as not a field of mathematics but an autonomous mathematical science, like computer science and operations research. Unlike mathematics, statistics had its origins in public administration. Applications arose early in demography and economics; large areas of micro- and macro-economics today are "statistics" with an emphasis on time-series analyses. With its emphasis on learning from data and making best predictions, statistics also has been shaped by areas of academic research including psychological testing, medicine and epidemiology. The ideas of statistical testing have considerable overlap with decision science. With its concerns with searching and effectively presenting data, statistics has overlap with information science and computer science.
Etymology
The term statistics is ultimately derived from the Neo-Latin statisticum collegium and the Italian word statista. The German Statistik, first introduced by Gottfried Achenwall, originally designated the analysis of data about the state, signifying the "science of state". It acquired the meaning of the collection and classification of data generally in the early 19th century. It was introduced into English in 1791 by Sir John Sinclair when he published the first of 21 volumes titled Statistical Account of Scotland.

Origins in probability theory
Basic forms of statistics have been used since the beginning of civilization. Early empires often collated censuses of the population or recorded the trade in various commodities. The Han dynasty and the Roman Empire were some of the first states to extensively gather data on the size of the empire's population, geographical area and wealth.

The use of statistical methods dates back to at least the 5th century BCE. The historian Thucydides in his History of the Peloponnesian War describes how the Athenians calculated the height of the wall of Plataea by counting the number of bricks in an unplastered section of the wall sufficiently near them to be able to count them. The count was repeated several times by a number of soldiers. The most frequent value so determined was taken to be the most likely value of the number of bricks. Multiplying this value by the height of the bricks used in the wall allowed the Athenians to determine the height of the ladders necessary to scale the walls.
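A minimal sketch of the same procedure in modern form follows; the counts and the brick-course height are invented for illustration, since Thucydides reports no such figures.

from statistics import mode

# Hypothetical soldiers' counts of brick courses; all numbers are invented for illustration.
counts = [97, 98, 98, 99, 98, 100, 97, 98]
course_height_m = 0.10                  # assumed height of one course of bricks, in metres
estimated_courses = mode(counts)        # the most frequent count, as in Thucydides' account
print("estimated wall height:", estimated_courses * course_height_m, "m")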
The Trial of the Pyx is a test of the purity of the coinage of the Royal Mint which has been held on a regular basis since the 12th century. The Trial itself is based on statistical sampling methods. After minting a series of coins – originally from ten pounds of silver – a single coin was placed in the Pyx – a box in Westminster Abbey. After a given period – now once a year – the coins are removed and weighed. A sample of coins removed from the box is then tested for purity.
The Nuova Cronica, a 14th-century history of Florence by the Florentine banker and official Giovanni Villani, includes much statistical information on population, ordinances, commerce and trade, education, and religious facilities and has been described as the first introduction of statistics as a positive element in history, though neither the term nor the concept of statistics as a specific field yet existed.
The arithmetic mean, although a concept known to the Greeks, was not generalised to more than two values until the 16th century. The introduction of decimal fractions by Simon Stevin in 1585 seems likely to have facilitated these calculations. This method was first adopted in astronomy by Tycho Brahe, who was attempting to reduce the errors in his estimates of the locations of various celestial bodies.
The idea of the median originated in Edward Wright's book on navigation in 1599 in a section concerning the determination of location with a compass. Wright felt that this value was the most likely to be the correct value in a series of observations. The difference between the mean and the median was noticed in 1669 by Christiaan Huygens in the context of using Graunt's tables.
The term 'statistic' was introduced by the Italian scholar Girolamo Ghilini in 1589 with reference to this science. The birth of statistics is often dated to 1662, when John Graunt, along with William Petty, developed early human statistical and census methods that provided a framework for modern demography. He produced the first life table, giving probabilities of survival to each age. His book Natural and Political Observations Made upon the Bills of Mortality used analysis of the mortality rolls to make the first statistically based estimation of the population of London. He knew that there were around 13,000 funerals per year in London and that three people died per eleven families per year. He estimated from the parish records that the average family size was 8 and calculated that the population of London was about 384,000; this is the first known use of a ratio estimator. Laplace in 1802 estimated the population of France with a similar method, described below.
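In modern terms, the ratio arithmetic implied by these figures runs roughly as follows; the intermediate rounding here is illustrative, and Graunt's own working differed in detail:

number of families ≈ 13,000 deaths per year ÷ (3/11 deaths per family per year) ≈ 47,700
population ≈ 47,700 families × 8 persons per family ≈ 381,000, close to the rounded figure of 384,000.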
Although the original scope of statistics was limited to data useful for governance, the approach was extended to many fields of a scientific or commercial nature during the 19th century. The mathematical foundations for the subject heavily drew on the new probability theory, pioneered in the 16th century by Gerolamo Cardano, Pierre de Fermat and Blaise Pascal. Christiaan Huygens gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi and Abraham de Moivre's The Doctrine of Chances treated the subject as a branch of mathematics. In his book Bernoulli introduced the idea of representing complete certainty as one and probability as a number between zero and one.
In 1700, Isaac Newton carried out the earliest known form of linear regression, writing the first of the ordinary least squares normal equations, averaging astronomical data, and summing the residuals to zero in his analysis of Hipparchus's equinox observations. He distinguished between two inhomogeneous sets of data and might have thought of an optimal solution in terms of bias, but not in terms of effectiveness.
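For context, and not as a reconstruction of Newton's own working: in modern notation, fitting a straight line y ≈ a + b·x to n observations (x_i, y_i) by ordinary least squares leads to the normal equations

Σ y_i = n·a + b·Σ x_i
Σ x_i·y_i = a·Σ x_i + b·Σ x_i²,

the first of which states precisely that the residuals sum to zero.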
A key early application of statistics in the 18th century was to the human sex ratio at birth. John Arbuthnot studied this question in 1710. Arbuthnot examined birth records in London for each of the 82 years from 1629 to 1710. In every year, the number of males born in London exceeded the number of females. Considering more male or more female births as equally likely, the probability of the observed outcome is 0.5^82, or about 1 in 4.8 × 10^24; in modern terms, this is the p-value. This is vanishingly small, leading Arbuthnot to conclude that this was not due to chance, but to divine providence: "From whence it follows, that it is Art, not Chance, that governs." This and other work by Arbuthnot is credited as "the first use of significance tests", the first example of reasoning about statistical significance and moral certainty, and "... perhaps the first published report of a nonparametric test ...", specifically the sign test.
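The probability Arbuthnot computed can be reproduced in a few lines; the sketch below is a modern restatement of the arithmetic, not of his original method of computation.

from fractions import Fraction

# Under the hypothesis that male and female births are equally likely, the chance that
# males outnumber females in every one of the 82 years is (1/2)**82.
p = Fraction(1, 2) ** 82
print(float(p))          # about 2.07e-25
print(p.denominator)     # 4835703278458516698824704, i.e. about 4.8 * 10**24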
The formal study of the theory of errors may be traced back to Roger Cotes' Opera Miscellanea, but a memoir prepared by Thomas Simpson in 1755 first applied the theory to the discussion of errors of observation. The reprint of this memoir lays down the axioms that positive and negative errors are equally probable, and that there are certain assignable limits within which all errors may be supposed to fall; continuous errors are discussed and a probability curve is given. Simpson discussed several possible distributions of error. He first considered the uniform distribution and then the discrete symmetric triangular distribution, followed by the continuous symmetric triangular distribution. Tobias Mayer, in his study of the libration of the moon, invented the first formal method for estimating unknown quantities by generalizing the averaging of observations under identical circumstances to the averaging of groups of similar equations.
Roger Joseph Boscovich, in his 1755 work on the shape of the earth, De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani gradus a PP. Maire et Boscovich, proposed that the true value of a series of observations would be that which minimises the sum of absolute errors. In modern terminology this value is the median. The first example of what later became known as the normal curve was studied by Abraham de Moivre, who plotted this curve on November 12, 1733. De Moivre was studying the number of heads that occurred when a 'fair' coin was tossed.
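In modern notation (not Boscovich's own), his criterion chooses the value c minimising

|x_1 − c| + |x_2 − c| + … + |x_n − c|,

and the minimiser is a sample median of x_1, …, x_n; for an even number of observations, any value between the two middle observations achieves the minimum.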
In 1763 Richard Price transmitted to the Royal Society Thomas Bayes's proof of a rule for using a binomial distribution to calculate a posterior probability on a prior event.
In 1765 Joseph Priestley invented the first timeline charts.
Johann Heinrich Lambert in his 1765 book Anlage zur Architectonic proposed the semicircle as a distribution of errors:
f(x) = (2/π) √(1 − x²), with −1 < x < 1.
Pierre-Simon Laplace made the first attempt to deduce a rule for the combination of observations from the principles of the theory of probabilities. He represented the law of probability of errors by a curve and deduced a formula for the mean of three observations.
Laplace in 1774 noted that the frequency of an error could be expressed as an exponential function of its magnitude once its sign was disregarded. This distribution is now known as the Laplace distribution. Lagrange proposed a parabolic distribution of errors in 1776.
Laplace in 1778 published his second law of errors wherein he noted that the frequency of an error was proportional to the exponential of the square of its magnitude. This was subsequently rediscovered by Gauss and is now best known as the normal distribution which is of central importance in statistics. This distribution was first referred to as the normal distribution by C. S. Peirce in 1873 who was studying measurement errors when an object was dropped onto a wooden base. He chose the term normal because of its frequent occurrence in naturally occurring variables.
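In modern notation (the parametrisation here is a conventional one, not Laplace's own), the two laws of error are:

First law (1774): f(x) = (m/2) · exp(−m·|x|), the Laplace distribution.
Second law (1778): f(x) = (h/√π) · exp(−h²·x²), the normal distribution.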
Lagrange also suggested in 1781 two other distributions for errors – a raised cosine distribution and a logarithmic distribution.
Laplace gave a formula for the law of facility of error, but one which led to unmanageable equations. Daniel Bernoulli introduced the principle of the maximum product of the probabilities of a system of concurrent errors.
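In modern notation (a restatement, not Bernoulli's own), with an assumed error density f and observations x_1, …, x_n of an unknown quantity θ, the principle chooses θ to maximise the product

f(x_1 − θ) · f(x_2 − θ) · … · f(x_n − θ),

which is what later came to be called the likelihood.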
In 1786 William Playfair introduced the idea of graphical representation into statistics. He invented the line chart, bar chart and histogram and incorporated them into his works on economics, the Commercial and Political Atlas. This was followed in 1795 by his invention of the pie chart and circle chart which he used to display the evolution of England's imports and exports. These latter charts came to general attention when he published examples in his Statistical Breviary in 1801.
Laplace, in an investigation of the motions of Saturn and Jupiter in 1787, generalized Mayer's method by using different linear combinations of a single group of equations.
In 1791 Sir John Sinclair introduced the term 'statistics' into English in his Statistical Account of Scotland.
In 1802 Laplace estimated the population of France to be 28,328,612. He calculated this figure using the number of births in the previous year and census data for three communities. The census data of these communities showed that they had 2,037,615 persons and that the number of births was 71,866. Assuming that these samples were representative of France, Laplace produced his estimate for the entire population.
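In modern terms this is a ratio estimate: the sample gives about 2,037,615 / 71,866 ≈ 28.35 inhabitants per annual birth, and multiplying the national birth count for the preceding year (not quoted in this passage, though the final figure implies roughly one million) by this ratio yields the estimate of 28,328,612.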
[Image: Carl Friedrich Gauss, mathematician who developed the method of least squares in 1809]
The method of least squares, which was used to minimize errors in data measurement, was published independently by Adrien-Marie Legendre, Robert Adrain, and Carl Friedrich Gauss. Gauss had used the method in his famous 1801 prediction of the location of the dwarf planet Ceres. The observations that Gauss based his calculations on were made by the Italian monk Piazzi.
The method of least squares was preceded by the use of a median regression slope, a method that minimises the sum of the absolute deviations. A method of estimating this slope was invented by Roger Joseph Boscovich in 1760, which he applied to astronomy.
The term probable error – the median deviation from the mean – was introduced in 1815 by the German astronomer Friedrich Wilhelm Bessel. Antoine Augustin Cournot in 1843 was the first to use the term median for the value that divides a probability distribution into two equal halves.
Other contributors to the theory of errors were Ellis, De Morgan, Glaisher, and Giovanni Schiaparelli. Peters's formula for the "probable error" of a single observation was widely used and inspired early robust statistics.
In the 19th century authors on statistical theory included Laplace, S. Lacroix, Littrow, Dedekind, Helmert, Laurent, Liagre, Didion, De Morgan and Boole.
Gustav Theodor Fechner used the median in sociological and psychological phenomena. It had earlier been used only in astronomy and related fields. Francis Galton used the English term median for the first time in 1881, having earlier used the terms middle-most value in 1869 and the medium in 1880.
Adolphe Quetelet, another important founder of statistics, introduced the notion of the "average man" as a means of understanding complex social phenomena such as crime rates, marriage rates, and suicide rates.
The first tests of the normal distribution were invented by the German statistician Wilhelm Lexis in the 1870s. The only data sets available to him that he could show to be normally distributed were birth rates.