Ronald Fisher


Sir Ronald Aylmer Fisher was a British polymath who was active as a mathematician, statistician, biologist, geneticist, and academic. He has been described as "a genius who almost single-handedly created the foundations for modern statistical science" and "the single most important figure in 20th century statistics". In genetics, Fisher most comprehensively combined the ideas of Gregor Mendel and Charles Darwin, using mathematics to unite Mendelian genetics and natural selection; this contributed to the revival of Darwinism in the early 20th-century revision of the theory of evolution known as the modern synthesis. For his contributions to biology, Richard Dawkins declared Fisher to be the greatest of Darwin's successors. He is also considered one of the founding fathers of Neo-Darwinism. According to statistician Jeffrey T. Leek, Fisher is the most influential scientist of all time on the basis of the number of citations of his contributions.
From 1919, he worked at the Rothamsted Experimental Station for 14 years; there, he analyzed its immense body of data from crop experiments since the 1840s, and developed the analysis of variance. He established his reputation there in the following years as a biostatistician. Fisher also made fundamental contributions to multivariate statistics.
Fisher founded quantitative genetics, and, together with J. B. S. Haldane and Sewall Wright, is known as one of the three principal founders of population genetics. Fisher outlined Fisher's principle, the Fisherian runaway and sexy son hypothesis theories of sexual selection, and parental investment, and also pioneered linkage analysis and gene mapping. As the founder of modern statistics, Fisher made many contributions, including creating the modern method of maximum likelihood and deriving the properties of maximum likelihood estimators, fiducial inference, the derivation of various sampling distributions, and founding the principles of the design of experiments. Fisher's famous 1921 paper alone has been described as "arguably the most influential article" on mathematical statistics in the twentieth century, and equivalent to "Darwin on evolutionary biology, Gauss on number theory, Kolmogorov on probability, and Adam Smith on economics", and is credited with completely revolutionizing statistics. For his influence and numerous fundamental contributions, he has been described as "the most original evolutionary biologist of the twentieth century" and as "the greatest statistician of all time". His work is further credited with later initiating the Human Genome Project. Fisher also contributed to the understanding of human blood groups.
Fisher has also been praised as a pioneer of the Information Age. His work on a mathematical theory of information ran parallel to the work of Claude Shannon and Norbert Wiener, though based on statistical theory. A concept to have come out of his work is that of Fisher information. He also had ideas about social sciences, which have been described as a "foundation for evolutionary social sciences".
Fisher held strong views on race and eugenics, insisting on racial differences, although there is debate as to whether Fisher supported scientific racism. He was the Galton Professor of Eugenics at University College London and editor of the Annals of Eugenics. Due to Fisher's association with eugenics, and in an effort to distance themselves from this legacy, a number of institutions – including his alma mater Gonville and Caius College and his former employer University College London – have removed commemorations of their links to him.

Early life and education

Fisher was born in East Finchley in London, England, into a middle-class household; his father, George, was a successful partner in Robinson & Fisher, auctioneers and fine art dealers. He was one of twins, the other being still-born, and grew up the youngest, with three sisters and one brother. From 1896 until 1904 they lived at Inverforth House in London, where English Heritage installed a blue plaque in 2002, before moving to Streatham. His mother, Kate, died from acute peritonitis when he was 14, and his father lost his business 18 months later.
Lifelong poor eyesight caused his rejection by the British Army for World War I, but it also developed his ability to visualize problems in geometrical terms rather than by writing out mathematical solutions or proofs. He entered Harrow School at age 14 and won the school's Neeld Medal in mathematics. In 1909, he won a scholarship to study mathematics at Gonville and Caius College, Cambridge. In 1912, he gained a First in Mathematics. In 1915, he published a paper, The evolution of sexual preference, on sexual selection and mate choice.

Career

During 1913–1919, Fisher worked as a statistician in the City of London and taught physics and maths at a sequence of public schools, at the Thames Nautical Training College, and at Bradfield College. There he settled with his new bride, Eileen Guinness, with whom he had two sons and six daughters.
In 1918 he published "The Correlation between Relatives on the Supposition of Mendelian Inheritance", in which he introduced the term variance and proposed its formal analysis. He put forward a conceptual genetic model showing that continuous variation amongst phenotypic traits measured by biostatisticians could be produced by the combined action of many discrete genes and thus be the result of Mendelian inheritance. This was the first step towards establishing population genetics and quantitative genetics, which demonstrated that natural selection could change allele frequencies in a population, reconciling the discontinuous nature of Mendelian inheritance with gradual evolution. Joan Box, Fisher's biographer and daughter, says that Fisher had already resolved this problem in 1911. Today, Fisher's additive model is still regularly used in genome-wide association studies.

Rothamsted Experimental Station, 1919–1933

In 1919, he began working at the Rothamsted Experimental Station in Hertfordshire, where he would remain for 14 years. He had been offered a position at the Galton Laboratory of University College London led by Karl Pearson, but instead accepted a temporary role at Rothamsted to investigate the feasibility of analysing the vast amount of crop data accumulated since 1842 from the "Classical Field Experiments". He analysed the data recorded over many years, and in 1921 published Studies in Crop Variation I, his first application of the analysis of variance. Studies in Crop Variation II, written with his first assistant, Winifred Mackenzie, became the model for later ANOVA work. Later assistants who mastered and propagated Fisher's methods were Joseph Oscar Irwin, John Wishart and Frank Yates. Between 1912 and 1922 Fisher recommended, analysed and vastly popularized the maximum likelihood estimation method.
Fisher's 1924 article On a distribution yielding the error functions of several well known statistics presented Pearson's chi-squared test and William Gosset's Student's t-distribution in the same framework as the Gaussian distribution, and is where he developed Fisher's z-distribution, a new distribution commonly used decades later in the form of the F-distribution. He pioneered the principles of the design of experiments, the statistics of small samples, and the analysis of real data.
In 1925 he published Statistical Methods for Research Workers, one of the 20th century's most influential books on statistical methods. Fisher's method is a technique for data fusion or "meta-analysis". Fisher formalized and popularized use of the p-value in statistics, which plays a central role in his approach. Fisher proposed the level p = 0.05, or a 1 in 20 chance of being exceeded by chance, as a limit for statistical significance, and applied this to a normal distribution, yielding the rule of two standard deviations for statistical significance. The use of 1.96, the approximate value of the 97.5 percentile point of the normal distribution, as a threshold in probability and statistics also originated in this book.

"The value for which P = 0.05, or 1 in 20, is 1.96 or nearly 2; it is convenient to take this point as a limit in judging whether a deviation is to be considered significant or not."

In Table 1 of the work, he gave the more precise value 1.959964.
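The figures quoted above can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as the model of the standard normal distribution; the variable names are illustrative and do not come from Fisher's book. A two-sided 5% level leaves 2.5% in each tail, which is why the 97.5 percentile point is the relevant one.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-sided 5% level: 2.5% in each tail, so take the 97.5th percentile.
z = std_normal.inv_cdf(0.975)
print(round(z, 6))  # 1.959964

# Chance of a deviation beyond 2 standard deviations in either direction,
# i.e. what the "nearly 2" rule of thumb corresponds to.
two_sided_tail = 2 * (1 - std_normal.cdf(2.0))
print(round(two_sided_tail, 4))  # 0.0455
```

Rounding 1.96 up to 2 therefore makes the criterion slightly stricter than 1 in 20.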
In 1928, Fisher was the first to use diffusion equations to attempt to calculate the distribution of allele frequencies among populations, and he applied maximum likelihood methods to the estimation of genetic linkage.
In 1930, The Genetical Theory of Natural Selection was first published by Clarendon Press, with a dedication to Leonard Darwin. A core work of the neo-Darwinian modern evolutionary synthesis, it helped define population genetics, which Fisher founded alongside Sewall Wright and J. B. S. Haldane, and revived Darwin's neglected idea of sexual selection.
One of Fisher's favourite aphorisms was "Natural selection is a mechanism for generating an exceedingly high degree of improbability."
Fisher's fame grew, and he began to travel and lecture widely. In 1931, he spent six weeks at the Statistical Laboratory at Iowa State College where he gave three lectures per week, and met many American statisticians, including George W. Snedecor. He returned there again in 1936.

University College London, 1933–1943

In 1933, Fisher became the head of the Department of Eugenics at University College London. In 1934, he became editor of the Annals of Eugenics.
In 1935, he published The Design of Experiments, which was "also fundamental, statistical technique and application... The mathematical justification of the methods was not stressed and proofs were often barely sketched or omitted altogether.... led H.B. Mann to fill the gaps with a rigorous mathematical treatment". In this book Fisher also outlined the lady tasting tea, now a famous design of a randomized statistical experiment that uses Fisher's exact test and is the original exposition of Fisher's notion of a null hypothesis.
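The arithmetic behind the lady tasting tea can be illustrated with a short calculation. This is a sketch only, assuming the commonly cited setup of eight cups, four prepared milk-first and four tea-first, with the taster asked to pick out the four milk-first cups; those numbers and the function name `p_at_least` are assumptions for illustration and are not stated in the text above. Under the null hypothesis of pure guessing, the number of correct picks follows a hypergeometric distribution, which is the basis of Fisher's exact test.

```python
from math import comb

def p_at_least(correct, cups=8, milk_first=4):
    """One-sided probability of picking at least `correct` of the
    milk-first cups by guessing alone (hypergeometric tail)."""
    total = comb(cups, milk_first)  # all equally likely ways to pick the cups
    ways = sum(
        comb(milk_first, k) * comb(cups - milk_first, milk_first - k)
        for k in range(correct, milk_first + 1)
    )
    return ways / total

print(p_at_least(4))  # 1 in 70, about 0.014: below the 0.05 level
print(p_at_least(3))  # 17 in 70, about 0.243: not significant
```

With this design, only a perfect identification of all four cups falls below the 1 in 20 level.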
The same year he published a paper on fiducial inference and applied it to the Behrens–Fisher problem, the solution to which, proposed first by Walter Behrens and a few years later by Fisher, is the Behrens–Fisher distribution.
In 1936, he introduced the Iris flower data set as an example of discriminant analysis.
In his 1937 paper The wave of advance of advantageous genes he proposed Fisher's equation in the context of population dynamics to describe the spatial spread of an advantageous allele, and explored its travelling wave solutions. Out of this also came the Fisher–Kolmogorov equation. In 1937, he visited the Indian Statistical Institute in Calcutta, which at the time had one part-time employee, P. C. Mahalanobis, and he often returned to encourage its development. He was the guest of honour at its 25th anniversary in 1957, when it had 2,000 employees.
In 1938, Fisher and Frank Yates described the Fisher–Yates shuffle in their book Statistical tables for biological, agricultural and medical research. Their description of the algorithm used pencil and paper; a table of random numbers provided the randomness.
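The pencil-and-paper procedure can be sketched in code. The version below is an illustrative reconstruction and not a transcription of the book: it assumes the usual description of the method (write out the numbers 1 to n, repeatedly pick a random position among the numbers not yet struck out, strike that number and append it to the result), and it substitutes Python's `random.Random` for the table of random numbers. The function name `fisher_yates_original` is hypothetical.

```python
import random

def fisher_yates_original(n, rng=None):
    """Return a random permutation of 1..n by repeatedly striking out
    the k-th remaining number, with k drawn uniformly at random."""
    if rng is None:
        rng = random.Random()
    remaining = list(range(1, n + 1))  # numbers not yet struck out
    result = []
    while remaining:
        k = rng.randint(1, len(remaining))   # inclusive bounds, 1-based count
        result.append(remaining.pop(k - 1))  # strike out and write down
    return result

# A seeded generator stands in for a fixed page of the random number table.
print(fisher_yates_original(10, random.Random(1938)))
```

Removing an element from the middle of a list makes this sketch quadratic in n; the in-place variant generally used on computers today swaps elements instead and runs in linear time.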