Karl Pearson


Bibliographic Details
Main Author: Magnello, ME
Format: Thesis
Language:English
Published: 1994
Subjects:
Description
Summary:<p>This thesis examines the development of modern statistical theory and its emergence as a highly specialised mathematical discipline at the end of the nineteenth century. The statistical work of the mathematician and statistician Karl Pearson (1857-1936), who almost singularly created the modern theory of statistics, is the focus of the thesis. The impact of the statistical and experimental work of the Darwinian zoologist W.F.R. Weldon (1860-1906), on the emergence and construction of Pearsonian statistical innovation, is central to the arguments developed in this thesis. Contributions to the Pearsonian corpus from such statisticians as Francis Ysidro Edgeworth (1845-1926), Francis Galton (1822-1911), and George Udny Yule (1871-1951) are also addressed.</p> <p>The scope of the thesis does not involve a detailed account of every technical contribution that Pearson made to statistics. Instead, it provides a unifying assessment of Pearson's most seminal and innovative contributions to modern statistical theory devised in the Biometric School, at University College London, from 1892 to 1903. An assessment of Pearson's statistical contributions also entails a comprehensive examination of the two separate methodologies he developed in the Drapers' Biometric Laboratory (from 1903 to 1933) and in the Galton Eugenics Laboratory (from 1907 to 1933).</p> <p>This thesis arises, in part, from a desire to reassess the state of the historiography of Pearsonian statistics over the course of the last half century. Some of the earliest work on Pearson came from his former students who emphasised his achievements as a statistician usually from the perspective of the state of the discipline in their time. The conventional view has presumed that Pearson's relationship with Galton and thus to Galton's work on <em>simple correlation, simple regression</em>, inheritance and eugenics provided the impetus to Pearson's own statistical work. 
This approach, which focuses on a part of Pearson's statistical work, has provided minimal insight into the complexity of the totality of Pearsonian statistics.</p> <p>Another approach, derived from the sociology of knowledge in the 1970s, espoused this conventional view and linked Pearson's statistical work to eugenics by placing his work in a wider context of social and political ideologies. This has usually entailed frequent recourse to Pearson's social and political views <em>vis-a-vis</em> his popular writings on eugenics. This approach, whilst indicating the political and social dimensions of science, has produced a rather mono-causal or uni-dimensional view of history. The crucial question of the relation between his technical contributions and his ideology in the construction of his statistical methods has not yet been adequately considered.</p> <p>This thesis argues that the impetus to Pearson's earliest statistical work was given by his efforts to tackle the problems of asymmetrical biological distributions (arising from Weldon's dimorphic distribution of the female shore crab in the Bay of Naples). Furthermore, it argues that the fundamental developments and construction of Pearsonian statistics arose from the Darwinian biological concepts at the centre of Weldon's statistical and experimental work on marine organisms in Naples and in Plymouth. Charles Darwin's recognition that species comprised different sets of 'statistical' populations (rather than consisting of 'types' or 'essences') led to a reconceptualisation of statistical populations by Pearson and Weldon which, in turn, led to their attempts to find a statistical resolution of the pre-Darwinian Aristotelian essentialistic concept of species. Pearson's statistical developments thus involved a greater consideration of speciation and of Darwin's theory of natural selection than hitherto considered. 
This has, therefore, entailed a reconstruction of the totality of Pearsonian statistics to identify the mathematical and biological developments that underpinned his work and to determine other sources of influence in this development.</p> <p>Pearson's writings are voluminous: as principal author he published more than 540 papers and books of which 361 are statistical. The other publications include 67 literary and historical writings, 49 eugenics publications, 36 pure mathematics and physics papers and 27 reports on university matters. He also published at least 111 letters, notes and book reviews. His collected papers and letters at University College London consist of 235 boxes of family papers, scientific manuscripts and 14,000 letters. One of the most extensive sets of letters in the collection is that of W.F.R. Weldon and his wife, Florence Joy Weldon, which consists of nearly 1,000 pieces of correspondence. No published work on Pearson to date has properly utilised the correspondence between Pearson and the Weldons. Particular emphasis has been given to this collection as these letters indicate (in tandem with Pearson's Gresham lectures and the seminal statistical published papers) that Pearson's earliest statistical work started in 1892 (rather than 1895-1896) and that Weldon's influence and work during these years was decisive in the development and advancement of Pearsonian statistics.</p> <p>The approach adopted in this thesis is essentially that of an intellectual biography which is thematic and is broadly chronological. This approach has been adopted to make greater use of primary sources in an attempt to provide a more historically sensitive interpretation of Pearson's work than has been used previously. 
It has thus been possible to examine these three (as yet unexamined) key Pearsonian developments: (1) his earliest statistical work (from 1892 to 1895), (2) his joint biometrical projects with Weldon (from 1898 to 1906) and a shift in the focus of research in the Drapers' Biometric Laboratory following Weldon's death in 1906 and (3) the later work in the twentieth century when he established the two laboratories which were underpinned by two separate methodologies.</p> <p>The arguments, which follow a chronological progression, have been built around Darwin's ideas of biological variation, 'statistical' populations, his theory of natural selection and Galton's law of ancestral inheritance. The first two chapters provide background material to the arguments developed in the thesis. Weldon's use of correlation (for the identification of species) in 1889 is examined in Chapter III. It is argued that Pearson's analysis of Weldon's dimorphic distribution led to their work on speciation which led on to Pearson's earliest innovative statistical work. Weldon's most productive research with Pearson, discussed in Chapter IV, came to fruition when he showed empirical evidence of natural selection by detecting disturbances (or deviations) in the distribution from normality as a consequence of differential mortality rates. This research enabled Pearson to further develop his theory of frequency distributions.</p> <p>The central part of the thesis broadens out to examine further issues not adequately examined. Galton's statistical approach to heredity is addressed in Chapter V, and it is shown that Galton adumbrated Pearson's work on <em>multiple correlation</em> and <em>multiple regression</em> with his law of ancestral heredity. 
This work, in conjunction with Weldon's work on natural selection, led to Pearson's introduction of the use of determinantal matrix algebra into statistical theory in 1896: this (much neglected) development was pivotal in the professionalisation of the emerging discipline of mathematical statistics.</p> <p>Pearson's work on goodness of fit testing provided the machinery for reconstructing his most comprehensive statistical work which spanned four decades and encompassed his entire working life as a statistician. Thus, a greater part of Pearsonian statistics has been examined than in previous studies. This work, which is assessed in Chapter VI, began in 1892 when he used the sixth-moment (devised from his <em>method of moments</em>) for a measure of goodness of fit for problems of speciation, continued throughout the 1890s in his attempt to find an empirical measure of a goodness of fit test for problems of natural selection, culminated in 1900 when he devised the chi-square (χ<sup>2</sup>, P) goodness of fit test (his single most significant contribution to modern statistical theory) and ended with the last paper he wrote six weeks before his death on 27 April 1936. In 1904 Pearson devised the 'square contingency coefficient' (i.e., the <em>chi-square statistic</em>).</p> <p>The penultimate chapter examines the methodologies underpinning the infrastructure of the Drapers' Biometric Laboratory (which was built upon the statistical methods devised in the Biometric School) and the Galton Eugenics Laboratory which involved the use of a complex interconnecting set of family pedigrees and actuarial death rates. 
It is concluded that in an effort to redress the balance in the historiography of Pearsonian statistics, these three key developments need careful consideration: Pearson's earliest statistical work on goodness of fit testing, the two separate Pearsonian methodologies used in his laboratories and his working relationship with his colleague and 'closest friend' W.F.R. Weldon.</p>