Naked Statistics: Stripping the Dread from the Data

Charles Wheelan

Nonfiction | Reference/Text Book | Adult | Published in 2012

Important Quotes

“Statistics is like a high-caliber weapon: helpful when used correctly and potentially disastrous in the wrong hands.”


(Introduction, Page xvi)

The introduction ends on a simile establishing the book’s central argument and purpose. By comparing statistics to a weapon, the author frames statistical knowledge as a source of power that carries significant responsibility. This framing introduces the dual nature of the book’s project: to demonstrate the helpful applications of statistical analysis while simultaneously warning against its potential for misuse, intentional or not, thereby setting up the theme that Statistical Literacy Is Empowering.

“[T]he most important thing to recognize is that the Gini index is just like the passer rating. It’s a handy tool for collapsing complex information into a single number.”


(Chapter 1, Page 2)

This analogy demystifies a seemingly complex economic tool, the Gini index, by comparing it to a familiar sports statistic. This technique is a key element of the book’s pedagogical style, making abstract concepts accessible by grounding them in everyday contexts. By showing that both the Gini index and the passer rating serve the same function—simplifying a multifaceted reality into a single, comparable figure—the author illustrates the universal utility and purpose of descriptive statistics.

“If I were to describe the patrons of this bar as having an average annual income of $91 million, the statement would be both statistically correct and grossly misleading.”


(Chapter 2, Page 19)

Wheelan often presents humorous scenarios to make abstract concepts intuitive for the reader. Here, after presenting the hypothetical of Bill Gates walking into a bar, he crystallizes the distinction between the mean and the median. The statement’s paradoxical nature—being simultaneously “correct and grossly misleading”—highlights how easily statistics can be used to obscure truth, showing how Statistics Can Mislead or Be Manipulated.
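The distinction is easy to verify directly. Below is a minimal Python sketch with invented incomes, chosen so the numbers echo Wheelan’s figure: one extreme outlier drags the mean to roughly $91 million while the median barely moves.

```python
from statistics import mean, median

# Ten hypothetical bar patrons with ordinary incomes (invented figures)
patrons = [35_000, 40_000, 42_000, 48_000, 50_000,
           52_000, 55_000, 60_000, 65_000, 70_000]

patrons.append(1_000_000_000)  # a billionaire walks into the bar

print(f"mean:   ${mean(patrons):,.0f}")    # ~ $91,000,000 -- "correct"
print(f"median: ${median(patrons):,.0f}")  # $52,000 -- representative
```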

“The beauty of the normal distribution […] comes from the fact that we know by definition exactly what proportion of the observations in a normal distribution lie within one standard deviation of the mean (68.2 percent), within two standard deviations of the mean (95.4 percent), within three standard deviations (99.7 percent), and so on.”


(Chapter 2, Page 26)

This fundamental rule of the normal distribution is a cornerstone of statistical analysis that recurs throughout the book. The author’s language, such as “beauty,” signals the elegance and importance of this mathematical concept. By providing specific, fixed percentages (68.2, 95.4, and 99.7), the text codifies the relationship between the standard deviation and the distribution of data, laying the groundwork for more complex concepts like inference and probability.
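Those fixed proportions fall directly out of the normal curve itself, as a few lines of Python using the standard library’s NormalDist confirm:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1
for k in (1, 2, 3):
    # Share of observations within k standard deviations of the mean
    share = z.cdf(k) - z.cdf(-k)
    print(f"within {k} sd: {share:.2%}")
# within 1 sd: 68.27%
# within 2 sd: 95.45%
# within 3 sd: 99.73%
```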

“Accuracy is a measure of whether a figure is broadly consistent with the truth—hence the danger of confusing precision with accuracy. […] [N]o amount of precision can make up for inaccuracy.”


(Chapter 3, Page 37)

Here, the author makes a crucial technical distinction that underpins the chapter’s argument about statistical deception. By defining and contrasting the terms “precision” and “accuracy,” the passage illuminates how a number can appear exact without being correct, a key form of statistical misuse. The final declarative statement functions as an aphorism, a memorable rule for the reader to use when critically evaluating statistical claims.

“It’s entirely possible for most of the students to be improving and most of the schools to be getting worse—if the students showing improvement happen to be in very big schools.”


(Chapter 3, Page 40)

Wheelan resolves a hypothetical paradox posed by two competing claims about school performance. The resolution serves as a concise explanation of how changing the unit of analysis—in this case, from schools to students—can produce contradictory but technically true statements from the same data. In constructing this puzzle and then unraveling it, the author demonstrates a common method of statistical manipulation in an accessible way.
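The paradox can be reproduced with a toy data set. In the hypothetical sketch below (all numbers invented), the improving students sit in one very large school, so student-level and school-level summaries point in opposite directions:

```python
# Invented data: one huge school where most students improve, two small
# schools where most students decline.
schools = {
    "Big High":    {"students": 1000, "improved": 900},
    "Small North": {"students": 10,   "improved": 2},
    "Small South": {"students": 10,   "improved": 2},
}

total    = sum(s["students"] for s in schools.values())
improved = sum(s["improved"] for s in schools.values())
print(f"students improving: {improved / total:.0%}")  # 89% -- most students

worse = sum(s["improved"] / s["students"] < 0.5 for s in schools.values())
print(f"schools where most students declined: {worse} of {len(schools)}")  # 2 of 3
```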

“The sad paradox of this seemingly helpful descriptive statistic is that cardiologists responded rationally by withholding care from the patients who needed it most.”


(Chapter 3, Page 55)

In a discussion of a real-world example about cardiologists being publicly graded on their patient mortality rates, Wheelan uses the term “sad paradox” to highlight the ironic and dangerous outcome of a well-intentioned statistical measure. The analysis reveals how performance metrics can create perverse incentives, where individuals work to improve the statistic itself rather than the underlying goal, in this case doctors avoiding high-risk patients to protect their scores.

“One crucial point in this general discussion is that correlation does not imply causation; a positive or negative association between two variables does not necessarily mean that a change in one of the variables is causing the change in the other.”


(Chapter 4, Page 63)

After articulating a foundational principle of statistical analysis, Wheelan immediately illustrates it with a relatable example involving televisions and SAT scores. This structure allows him to explain the concept of a confounding third variable (parental income) in a clear and accessible way, empowering the reader to avoid common analytical fallacies.
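A short simulation makes the confounding concrete. In the sketch below (hypothetical numbers throughout), household income drives both television ownership and SAT scores; the two end up strongly correlated even though neither causes the other:

```python
import random
from statistics import correlation  # Python 3.10+

random.seed(0)
# Income (in $ thousands) drives both variables; TVs have no causal
# effect on test scores in this invented model.
income = [random.uniform(20, 200) for _ in range(1_000)]
tvs    = [inc / 40 + random.gauss(0, 0.5) for inc in income]
sat    = [800 + 3 * inc + random.gauss(0, 50) for inc in income]

print(f"corr(TVs, SAT) = {correlation(tvs, sat):.2f}")  # strongly positive
```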

“Still, the system is just a super fancy variation on what people have been doing since the dawn of film: find someone with similar tastes and ask for a recommendation. […] That is the essence of correlation.”


(Chapter 4, Page 65)

This passage frames a complex algorithm as a technologically advanced version of a simple human behavior. Through this analogy, the author makes the abstract concept of correlation tangible and intuitive for a non-expert audience. The passage exemplifies the book’s central purpose: stripping away intimidating complexity to reveal the straightforward logic underlying powerful statistical tools.
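That “essence” can be computed in a couple of lines. The sketch below scores the similarity of two hypothetical viewers’ film ratings with an ordinary correlation coefficient:

```python
from statistics import correlation  # Python 3.10+

# Invented 1-5 star ratings by two viewers for the same six films
alice = [5, 4, 1, 2, 5, 3]
bob   = [4, 5, 2, 1, 4, 3]

# Near +1 means similar tastes (ask Bob for recommendations);
# near -1 means opposite tastes (avoid whatever Bob loves).
print(f"taste similarity: {correlation(alice, bob):.2f}")  # ~ 0.80
```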

“Probabilities do not tell us what will happen for sure; they tell us what is likely to happen and what is less likely to happen. Sensible people can make use of these kinds of numbers in business and life.”


(Chapter 5, Page 72)

Here, the author uses parallel structure (“what will happen” vs. “what is likely to happen”) to define both the power and the limitations of probability. The second sentence frames probability as a practical instrument for rational decision-making under conditions of uncertainty, central to the theme of Probability as a Tool for Better Decisions, which positions statistics as a tool for managing risk rather than eliminating it.

“An important theorem known as the law of large numbers tells us that as the number of independent trials increases, the average of the outcomes will get closer and closer to its expected value.”


(Chapter 5, Pages 78-79)

The author introduces a foundational theorem of probability using clear, non-mathematical language. By immediately applying the law to real-world examples like lottery tickets and casino operations, he demonstrates its practical power in explaining why certain enterprises are profitable over time. This technique of defining a formal concept and then grounding it in familiar contexts makes a difficult idea accessible and reinforces the book’s premise that Statistical Literacy Is Empowering.
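The theorem is easy to watch in action. Rolling a fair die has an expected value of 3.5, and a short simulation (a sketch, not an example from the book) shows the running average tightening around it as the trials pile up:

```python
import random
random.seed(1)

# Expected value of one fair die roll: (1+2+3+4+5+6) / 6 = 3.5
for n in (10, 1_000, 100_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(f"{n:>7} rolls: average = {sum(rolls) / n:.3f}")
# The averages drift toward 3.5 as n grows -- the casino's edge in action.
```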

“The more broadly applicable lesson is that your gut instinct on probability can sometimes steer you astray.”


(Chapter 5½, Page 94)

Serving as the thesis for the chapter on the Monty Hall problem, this sentence generalizes the specific lesson from the famous probability puzzle. The author uses the counterintuitive game show scenario as a case study to prove this larger point about the unreliability of intuition. This explicitly highlights a core argument of the book: that systematic, statistical reasoning is a necessary corrective to common cognitive biases.
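Simulation is a useful corrective when intuition fails, and the Monty Hall problem is the classic case. The sketch below (one common way to code the puzzle, not Wheelan’s) plays the game many times under each strategy:

```python
import random
random.seed(2)

def win_rate(switch: bool, trials: int = 100_000) -> float:
    wins = 0
    for _ in range(trials):
        car  = random.randrange(3)  # the prize door
        pick = random.randrange(3)  # the contestant's first choice
        # The host opens a goat door, so switching wins exactly when
        # the first pick was wrong.
        wins += (pick != car) if switch else (pick == car)
    return wins / trials

print(f"stay:   {win_rate(switch=False):.3f}")  # ~ 0.333
print(f"switch: {win_rate(switch=True):.3f}")   # ~ 0.667
```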

“The primary critique of VaR is that the underlying risks associated with financial markets are not as predictable as a coin flip […] The false precision embedded in the models created a false sense of security.”


(Chapter 6, Page 97)

This passage identifies the central flaw in the risk management models that contributed to the 2008 financial crisis, directly demonstrating how Statistics Can Mislead or Be Manipulated. The author uses a simple contrast—the unpredictable stock market versus a predictable coin flip—to expose the model’s faulty assumptions. The phrase “false sense of security” captures the danger of mistaking a precise number for an accurate one, arguing that flawed statistical tools can be worse than none at all.

“He explained the calculation: Since the incidence of a cot death is rare, 1 in 8,500, the chance of having two cot deaths in the same family would be (1/8,500)², which is roughly 1 in 73 million.”


(Chapter 6, Page 101)

This seemingly logical calculation from an “expert witness” is a case study in statistical error, exemplifying how a misapplication of a basic probability rule—multiplying probabilities of events that are not independent—can create a misleadingly definitive statistic. This real-world example demonstrates the high stakes of statistical literacy, showing how a mistake presented with authority can lead to a profound miscarriage of justice.
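The arithmetic itself is trivial; the error is the independence assumption behind it. The sketch below reproduces the expert’s multiplication and contrasts it with a calculation using a hypothetical conditional probability (the 1-in-100 figure is purely illustrative, not from the book):

```python
p_first = 1 / 8_500  # incidence of one cot death (from the quote)

# The expert's calculation treats the two deaths as independent:
print(f"assuming independence: 1 in {1 / p_first**2:,.0f}")  # 1 in 72,250,000

# If families share genetic or environmental risk factors, the second
# probability must be conditional on the first. Illustrative value only:
p_second_given_first = 1 / 100
print(f"allowing dependence:   1 in {1 / (p_first * p_second_given_first):,.0f}")
# 1 in 850,000 -- dramatically less "impossible"
```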

“Data are to statistics what a good offensive line is to a star quarterback. […] without them, you won’t ever see a star quarterback. […] no amount of fancy analysis can make up for fundamentally flawed data. Hence the expression ‘garbage in, garbage out.’”


(Chapter 7, Page 111)

This analogy compares data to an offensive line in football to establish in an accessible way that sophisticated statistical analysis is worthless without high-quality inputs. The reference to the idiom “garbage in, garbage out” distills the chapter’s central idea into a memorable phrase, underscoring that data collection, not calculation, is often the most critical stage of statistical work.

“The diagnosis of breast cancer had not just changed a woman’s present and the future; it had altered her past. Women with breast cancer had (unconsciously) decided that a higher-fat diet was a likely predisposition for their disease and (unconsciously) recalled a high-fat diet.”


(Chapter 7, Page 122)

In this example of recall bias, the author employs diction like “altered her past” to illustrate how a present condition can retroactively corrupt memory, thereby producing flawed data. The repetition of “unconsciously” emphasizes the subtle, non-deliberate nature of this bias, making it a particularly insidious problem for researchers. The sentence structure builds from a general statement to a specific explanation, effectively demonstrating how human psychology can undermine statistical accuracy.

“Much of it comes from the central limit theorem, which is the LeBron James of statistics—if LeBron were also a supermodel, a Harvard professor, and the winner of the Nobel Peace Prize. The central limit theorem is the ‘power source’ for many of the statistical activities that involve using a sample to make inferences about a large population.”


(Chapter 8, Page 127)

The author uses a hyperbolic analogy and pop culture reference to introduce the central limit theorem, a complex concept. The tangible, memorable references establish the theorem’s importance, and the metaphor of a “power source” communicates its function as the engine behind statistical inference, aligning with the book’s goal of demystifying statistics.

“If you draw large, random samples from any population, the means of those samples will be distributed normally around the population mean (regardless of what the distribution of the underlying population looks like).”


(Chapter 8, Page 141)

This concise, technical summary presents the central limit theorem’s core principle stripped of metaphor. The choice to present the concept so directly highlights its universal applicability. The parenthetical, “(regardless of what the distribution of the underlying population looks like),” clarifies the most powerful and often counterintuitive aspect of the theorem, showing how it allows for reliable inference even when the characteristics of the full population are unknown or non-normal.
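The claim is simple to test empirically. In the sketch below (a hypothetical exercise, not from the book), samples are drawn from a heavily skewed exponential population, yet the sample means display the normal curve’s 68 percent signature from Chapter 2:

```python
import random
from statistics import mean, stdev

random.seed(3)

def sample_mean(n: int = 100) -> float:
    # One sample of n draws from a skewed (exponential) population
    return mean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean() for _ in range(10_000)]
m, s = mean(means), stdev(means)
inside = sum(abs(x - m) < s for x in means) / len(means)
print(f"sample means within 1 sd: {inside:.1%}")  # ~ 68%, despite the skew
```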

“Statistics cannot prove anything with certainty. Instead, the power of statistical inference derives from observing some pattern or outcome and then using probability to determine the most likely explanation for that outcome.”


(Chapter 9, Page 144)

The emphasis in the opening sentence (italicized in the original) underscores the book’s goal of correcting common misconceptions about the nature of statistical proof. This statement reframes statistics as a structured method for evaluating evidence based on probability. This clarification underpins the idea that Statistical Literacy Is Empowering, as it equips the reader to understand the probabilistic—not deterministic—conclusions offered by data analysis.
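A worked example (invented here, not Wheelan’s) shows the logic: we observe an outcome, then ask how probable that outcome would be under a proposed explanation.

```python
from math import comb

# Observed outcome: 62 heads in 100 coin flips. How likely is a result
# at least that extreme if the coin is fair?
n, k = 100, 62
p_value = sum(comb(n, heads) for heads in range(k, n + 1)) / 2**n
print(f"P(62 or more heads | fair coin) = {p_value:.4f}")  # ~ 0.01

# We have not *proved* the coin is biased; we have shown that "fair coin"
# is an unlikely explanation for what we observed.
```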

“Bad polling results do not typically stem from bad math when calculating the standard errors. Bad polling results typically stem from a biased sample, or bad questions, or both. The mantra ‘garbage in, garbage out’ applies in spades when it comes to sampling public opinion.”


(Chapter 10, Page 178)

This quote reinforces the argument that the greatest statistical dangers often lie in methodology, not complex calculations, linking back to the core idea of Chapter 7. By stating that polling errors arise from “a biased sample, or bad questions,” the author highlights non-mathematical challenges that can render a precise-sounding result meaningless, exemplifying how Statistics Can Mislead or Be Manipulated. The accuracy of a poll’s conclusion is entirely dependent on the quality of its underlying data.
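The contrast is worth seeing in numbers. The standard-error arithmetic for a poll is genuinely easy (a sketch with invented figures below); what it cannot do is rescue a biased sample:

```python
from math import sqrt

# Hypothetical poll: 1,000 respondents, 52% favoring a candidate
p, n = 0.52, 1_000
se = sqrt(p * (1 - p) / n)           # standard error of the proportion
print(f"standard error: {se:.3f}")             # ~ 0.016
print(f"95% margin:     +/- {1.96 * se:.1%}")  # ~ +/- 3.1 points
# The margin of error is meaningless if the 1,000 respondents were a
# biased sample or the question was loaded: garbage in, garbage out.
```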

“Regression analysis is the statistical tool that helps us deal with this challenge. Specifically, regression analysis allows us to quantify the relationship between a particular variable and an outcome that we care about while controlling for other factors.”


(Chapter 11, Page 186)

This concise, functional definition of regression analysis uses technical but not overly intimidating language like “quantify the relationship” and “controlling for other factors”—a choice that shows readers how much they’ve already learned by this point in the book. This framing explains how statisticians can isolate a single relationship from a web of confounding variables, a concept illustrated throughout the chapter.
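A minimal sketch shows what “controlling for other factors” means in practice. The wage data below are invented, with two correlated explanatory variables; a least-squares fit recovers each variable’s separate effect:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000

# Invented data: wages depend on education AND experience, and the two
# explanatory variables are themselves correlated.
education  = rng.normal(14, 2, n)                        # years of school
experience = 20 - 0.5 * education + rng.normal(0, 3, n)  # years of work
wage = 2.0 * education + 1.0 * experience + rng.normal(0, 5, n)

X = np.column_stack([np.ones(n), education, experience])
coef, *_ = np.linalg.lstsq(X, wage, rcond=None)
print(f"education effect:  {coef[1]:.2f} (true: 2.0)")
print(f"experience effect: {coef[2]:.2f} (true: 1.0)")
```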

“Regression analysis is the hydrogen bomb of the statistics arsenal. […] In the wrong hands, regression analysis will yield results that are misleading or just plain wrong. And, as the estrogen example illustrates, even in the right hands this powerful statistical tool can send us speeding dangerously in the wrong direction.”


(Chapter 12, Page 213)

Comparing regression analysis to a weapon of mass destruction conveys its immense power and potential for ruinously misleading results. Wheelan thus uses an appropriately cautionary tone in a chapter focused on common errors that show how Statistics Can Mislead or Be Manipulated. The passage distinguishes between intentional misuse (“wrong hands”) and unintentional error (“right hands”), emphasizing that even well-intentioned researchers can produce dangerously incorrect conclusions.

“Regression results will be misleading and inaccurate if the regression equation leaves out an important explanatory variable, particularly if other variables in the equation ‘pick up’ that effect.”


(Chapter 12, Page 218)

In defining omitted variable bias, a critical and common error in statistical analysis, Wheelan uses the colloquial phrase “‘pick up’ that effect”—an image of one variable incidentally absorbing another’s influence that helps explain how an included variable can be wrongly credited with an impact actually caused by a missing, unmeasured factor.
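Rerunning the invented wage example from the Chapter 11 sketch, but with experience left out, shows the bias directly: education “picks up” part of experience’s effect, and its estimated coefficient lands well away from the true value.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5_000

# Same invented setup: wages depend on education and experience,
# which are negatively correlated with each other.
education  = rng.normal(14, 2, n)
experience = 20 - 0.5 * education + rng.normal(0, 3, n)
wage = 2.0 * education + 1.0 * experience + rng.normal(0, 5, n)

# Omit experience from the regression: education "picks up" its effect,
# and the estimate is biased away from the true value of 2.0.
X_short = np.column_stack([np.ones(n), education])
b, *_ = np.linalg.lstsq(X_short, wage, rcond=None)
print(f"education effect with experience omitted: {b[1]:.2f}")  # ~ 1.5
```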

“To measure the effect of any treatment or intervention, we need something to measure it against. […] clever researchers find ways to compare some treatment (e.g., going to Harvard) with the counterfactual, which is what would have happened in the absence of that treatment.”


(Chapter 13, Page 225)

This passage establishes the central challenge of program evaluation, which is the search for a proper basis of comparison to determine causality. It defines the “counterfactual” as the unobserved alternative that represents the ideal baseline for measuring an intervention’s true effect. The chapter outlines the quest for creative research designs that can successfully approximate this missing information.

“The Target statisticians had figured out that his daughter was pregnant before he did.”


(Conclusion, Page 253)

Serving as the climax of an anecdote about predictive analytics, this stark, declarative sentence demonstrates the real-world power of statistical modeling. The author distills a complex analysis into a human-scale outcome, forcing the reader to confront the ethical and privacy implications of corporate data mining. This example moves beyond theoretical concepts to show how statistical literacy is essential for navigating the modern world.
