Misuse of statistics
Statistics, when used in a misleading fashion, can trick the casual observer into believing something other than what the data shows. That is, a misuse of statistics occurs when
a statistical argument asserts a falsehood. In some cases, the misuse may be accidental. In others, it is purposeful and for the gain of the perpetrator. When the statistical reason involved is false or misapplied, this constitutes a statistical fallacy.
The consequences of such misinterpretations can be quite severe. For example, in medical science, correcting a falsehood may take decades and cost lives; likewise, in democratic societies, misused statistics can distort public understanding, entrench misinformation, and enable governments to implement harmful policies without accountability.
Misuses can be easy to fall into. Even professional scientists, mathematicians, and statisticians can be fooled by quite simple methods, even when they are careful to check everything. Scientists have been known to fool themselves with statistics owing to a lack of knowledge of probability theory and a lack of standardization of their tests.
Definition, limitations and context
One usable definition is: "Misuse of Statistics: Using numbers in such a manner that – either by intent or through ignorance or carelessness – the conclusions are unjustified or incorrect." The "numbers" include misleading graphics discussed in other sources. The term is not commonly encountered in statistics texts and there is no single authoritative definition. It is a generalization of lying with statistics, which was richly described by examples from statisticians 60 years ago. The definition confronts some problems:
- Statistics usually produces probabilities; conclusions are provisional
- The provisional conclusions have errors and error rates; commonly, about 5% of the provisional conclusions of significance testing are wrong
- Statisticians are not in complete agreement on ideal methods
- Statistical methods are based on assumptions which are seldom fully met
- Data gathering is usually limited by ethical, practical and financial constraints.
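The 5% error rate mentioned above can be illustrated with a small simulation. The sketch below (illustrative only; it assumes a two-sided z-test at the conventional 0.05 level, with every null hypothesis actually true) shows that significance testing rejects a true null about 5% of the time:

```python
import math
import random

random.seed(42)

def one_experiment(n=100, critical_z=1.96):
    """Run one experiment where the null hypothesis is true (mean 0)."""
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(sample) / n
    # Two-sided z-test with known sigma = 1: reject H0 if |z| > 1.96.
    z = mean * math.sqrt(n)
    return abs(z) > critical_z

trials = 10_000
false_positives = sum(one_experiment() for _ in range(trials))
rate = false_positives / trials
print(f"False-positive rate: {rate:.3f}")  # close to 0.05 by construction
```

Every rejection here is a wrong conclusion, since the simulated effect is always zero; the test's error rate is built into its design, not a flaw in its execution.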
An insidious misuse of statistics is completed by the listener, observer, audience, or juror. The supplier provides the "statistics" as numbers or graphics, allowing the consumer to draw conclusions that may be unjustified or incorrect. The poor state of public statistical literacy and the non-statistical nature of human intuition make it possible to mislead without explicitly producing a faulty conclusion. The definition is weak on the responsibility of the consumer of statistics.
A historian listed over 100 fallacies in a dozen categories including those of generalization and those of causation. A few of the fallacies are explicitly or potentially statistical including sampling, statistical nonsense, statistical probability, false extrapolation, false interpolation and insidious generalization. All of the technical/mathematical problems of applied probability would fit in the single listed fallacy of statistical probability. Many of the fallacies could be coupled to statistical analysis, allowing the possibility of a false conclusion flowing from a statistically sound analysis.
An example use of statistics is in the analysis of medical research. The process includes experimental planning, the conduct of the experiment, data analysis, drawing the logical conclusions, and presentation/reporting. The report is summarized by the popular press and by advertisers. Misuses of statistics can result from problems at any step in the process. The statistical standards ideally imposed on the scientific report are quite different from those imposed on the popular press and advertisers; however, cases exist of advertising disguised as science, such as the Australasian Journal of Bone & Joint Medicine. The definition of the misuse of statistics is weak on the required completeness of statistical reporting. The opinion is expressed that newspapers must provide at least the source for the statistics reported.
Simple causes
Many misuses of statistics occur because:
- The source is a subject matter expert, not a statistics expert. The source may incorrectly use a method or interpret a result.
- The source is a statistician, not a subject matter expert. An expert should know when the numbers being compared describe different things. Numbers can change while the underlying reality does not, as when legal definitions or political boundaries change.
- The subject being studied is not well defined, or some of its aspects are easy to quantify while others are hard to quantify or have no known quantification method. For example:
- * While IQ tests are available and numeric, it is difficult to define what they measure, as intelligence is an elusive concept.
- * Publishing "impact" has the same problem. Scientific papers and scholarly journals are often rated by "impact", quantified as the number of citations by later publications. Mathematicians and statisticians conclude that impact is not a very meaningful measure: "The sole reliance on citation data provides at best an incomplete and often shallow understanding of research – an understanding that is valid only when reinforced by other judgments. Numbers are not inherently superior to sound judgments."
- * A seemingly simple question about the number of words in the English language immediately encounters questions about archaic forms, accounting for prefixes and suffixes, multiple definitions of a word, variant spellings, dialects, fanciful creations, technical vocabulary, and so on.
- Data quality is poor. Apparel provides an example. People have a wide range of sizes and body shapes, so apparel sizing obviously must be multidimensional. Instead it is complex in unexpected ways. Some apparel is sold by size only, sizes vary by country and manufacturer, and some sizes are deliberately misleading. While sizes are numeric, even with care only the crudest of statistical analyses is possible using the size numbers.
- The popular press has limited expertise and mixed motives. If the facts are not "newsworthy" they may not be published. The motives of advertisers are even more mixed.
- "Politicians use statistics in the same way that a drunk uses lamp posts—for support rather than illumination" – Andrew Lang "What do we learn from these two ways of looking at the same numbers? We learn that a clever propagandist, right or left, can almost always find a way to present the data on economic growth that seems to support her case. And we therefore also learn to take any statistical analysis from a strongly political source with handfuls of salt." The term statistics originates from numbers generated for and utilized by the state. Good government may require accurate numbers, but popular government may require supportive numbers. "The use and misuse of statistics by governments is an ancient art."
Types of misuse
Discarding unfavorable observations
To promote a neutral product, a company must find or conduct, for example, 40 studies with a confidence level of 95%. If the product is useless, this would produce one study showing the product was beneficial, one study showing it was harmful, and thirty-eight inconclusive studies. This tactic becomes more effective when there are more studies available. Organizations that do not publish every study they carry out, such as tobacco companies denying a link between smoking and cancer, anti-smoking advocacy groups and media outlets trying to prove a link between smoking and various ailments, or miracle pill vendors, are likely to use this tactic.
Ronald Fisher considered this issue in his famous lady tasting tea experiment. Regarding repeated experiments, he said, "It would be illegitimate and would rob our calculation of its basis if unsuccessful results were not all brought into the account."
Another term related to this concept is cherry picking.
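The 40-study scenario above can be sketched as a simulation. This illustrative Python sketch (not from the source; it assumes a two-sided z-test at the 95% confidence level and a product whose true effect is zero) shows why reporting only the favorable study misleads:

```python
import math
import random

random.seed(1)

def run_study(n=50, critical_z=1.96):
    """One study of a truly useless product: the measured effect is pure noise."""
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    z = (sum(sample) / n) * math.sqrt(n)
    if z > critical_z:
        return "beneficial"
    if z < -critical_z:
        return "harmful"
    return "inconclusive"

studies = [run_study() for _ in range(40)]
counts = {v: studies.count(v) for v in ("beneficial", "harmful", "inconclusive")}
print(counts)
# Roughly 1 "beneficial", 1 "harmful", and 38 "inconclusive" studies are
# expected; publishing only the "beneficial" one is the cherry-picking misuse.
```

Fisher's point applies directly: the "beneficial" result only looks meaningful because the unsuccessful results were not brought into the account.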
Ignoring important features
Multivariable datasets have two or more features/dimensions. If too few of these features are chosen for analysis, the results can be misleading. This leaves the analyst vulnerable to any of various statistical paradoxes, or in some cases false causality as below.
Loaded questions
The answers to surveys can often be manipulated by wording the question in such a way as to induce a bias towards a certain answer from the respondent. For example, in polling support for a war, the questions:
- Do you support the attempt by the US to bring freedom and democracy to other places in the world?
- Do you support the unprovoked military action by the USA?
Though both questions concern the same conflict, they will likely elicit very different levels of support.
Another way to do this is to precede the question by information that supports the "desired" answer. For example, more people will likely answer "yes" to the question "Given the increasing burden of taxes on middle-class families, do you support cuts in income tax?" than to the question "Considering the rising federal budget deficit and the desperate need for more revenue, do you support cuts in income tax?"
The proper formulation of questions can be very subtle, but nonetheless can yield significant differences in results. Additionally, the responses to two questions can vary dramatically depending on the order in which they are asked. "A survey that asked about 'ownership of stock' found that most Texas ranchers owned stock, though probably not the kind traded on the New York Stock Exchange."