# Stats: Data and Models, 5th edition

Published by Pearson (February 13, 2019) © 2020

- **David E. Bock**, Ithaca High School (Retired), Cornell University
- **Paul F. Velleman**, Cornell University
- **Richard D. De Veaux**, Williams College
- **Floyd Bullard**, North Carolina School of Science and Mathematics


For courses in Introductory Statistics.

### Encourages statistical thinking using technology, innovative methods and humor

**Stats: Data and Models, 5th Edition** helps students think critically about data while maintaining the book's core concepts, coverage and unparalleled readability. The authors use technology and simulations to demonstrate variability at critical points throughout, making it easier for instructors to teach and for students to understand more complicated statistical concepts later in the course (such as the Central Limit Theorem). Students also get more exposure to large data sets and multivariate thinking, which better prepares them to be critical consumers of statistics today.

### Hallmark features of this title

- **Where Are We Going? chapter openers** give context for the work students are about to begin within the broader course.
- **Reality Checks** ask students to think about whether their answers make sense before interpreting their results.
- **Notation Alerts** appear whenever special notation is introduced.
- The **Tech Support** section provides instructions for applying the topics covered by the chapter within each of the supported statistics packages.
- **Focused examples** are provided as each important concept is introduced, applying the concept usually with real, up-to-the-minute data.
- **Just Checking questions** are quick checks throughout the chapter that involve minimal calculation and encourage students to pause and think about what they've just read.

### New and updated features of this title

- **Random Matters:** This new feature encourages a gradual, cumulative understanding of randomization.
- **Streamlined coverage of descriptive statistics** helps students progress more quickly through the first part of the book.
- **For 2 of the most difficult concepts in the introductory course, technology is utilized** to improve learning: the idea of a sampling distribution and the reasoning of statistical inference.
- **A third variable is introduced with contingency tables and mosaic plots in Chapter 3** to give students earlier experience with multivariable thinking. Then, following the discussion of correlation and regression as a tool (without inference) in Chapters 6, 7 and 8, multiple regression is introduced in Chapter 9.
- **Expanded and revised Think/Show/Tell Step-by-Step Examples** guide students through the process of analyzing a problem through worked examples.
- **New Web tools** provide interactive versions of the distribution tables at the back of the book, and tools for randomization inference methods such as the bootstrap and for repeated sampling from larger populations now can be found online.

### Features of MyLab Statistics for the 5th Edition

- **StatCrunch® Projects** provide opportunities for students to explore data beyond the classroom. In each project, students analyze a large data set in StatCrunch and answer corresponding, assignable questions for immediate feedback. StatCrunch Projects span the entire curriculum or focus on certain key concepts. Questions from each project can also be assigned individually.
- **MyLab Statistics exercises** are newly mapped to improve student learning outcomes. Homework reinforces and supports students' understanding of key statistics topics.
- **Updated Think/Show/Tell Step-by-Step Example videos** guide students through the process of analyzing a problem using the “Think, Show, and Tell” strategy from the textbook.
- **Simulation Applets** use technology to help students learn and visualize a wide range of topics covered in introductory statistics.
- **Learning Catalytics** is a student response tool that uses students' smartphones, tablets, or laptops to engage them in more interactive tasks and thinking. It helps to foster student engagement and peer-to-peer learning, generate class discussion, and guide lectures with real-time analytics. Now access pre-built exercises created by leading Pearson authors.

### I. EXPLORING AND UNDERSTANDING DATA

**1. Stats Starts Here**

- 1.1 What Is Statistics?
- 1.2 Data
- 1.3 Variables
- 1.4 Models

**2. Displaying and Describing Data**

- 2.1 Summarizing and Displaying a Categorical Variable
- 2.2 Displaying a Quantitative Variable
- 2.3 Shape
- 2.4 Center
- 2.5 Spread

**3. Relationships Between Categorical Variables–Contingency Tables**

- 3.1 Contingency Tables
- 3.2 Conditional Distributions
- 3.3 Displaying Contingency Tables
- 3.4 Three Categorical Variables

**4. Understanding and Comparing Distributions**

- 4.1 Displays for Comparing Groups
- 4.2 Outliers
- 4.3 Re-Expressing Data: A First Look

**5. The Standard Deviation as a Ruler and the Normal Model**

- 5.1 Using the Standard Deviation to Standardize Values
- 5.2 Shifting and Scaling
- 5.3 Normal Models
- 5.4 Working with Normal Percentiles
- 5.5 Normal Probability Plots
- Review of Part I: Exploring and Understanding Data

### II. EXPLORING RELATIONSHIPS BETWEEN VARIABLES

**6. Scatterplots, Association, and Correlation**

- 6.1 Scatterplots
- 6.2 Correlation
- 6.3 Warning: Correlation ≠ Causation
- 6.4 Straightening Scatterplots

**7. Linear Regression**

- 7.1 Least Squares: The Line of “Best Fit”
- 7.2 The Linear Model
- 7.3 Finding the Least Squares Line
- 7.4 Regression to the Mean
- 7.5 Examining the Residuals
- 7.6 *R*²: The Variation Accounted for by the Model
- 7.7 Regression Assumptions and Conditions

**8. Regression Wisdom**

- 8.1 Examining Residuals
- 8.2 Extrapolation: Reaching Beyond the Data
- 8.3 Outliers, Leverage, and Influence
- 8.4 Lurking Variables and Causation
- 8.5 Working with Summary Values
- 8.6 Straightening Scatterplots: The Three Goals
- 8.7 Finding a Good Re-Expression

**9. Multiple Regression**

- 9.1 What Is Multiple Regression?
- 9.2 Interpreting Multiple Regression Coefficients
- 9.3 The Multiple Regression Model: Assumptions and Conditions
- 9.4 Partial Regression Plots
- 9.5 Indicator Variables
- Review of Part II: Exploring Relationships Between Variables

### III. GATHERING DATA

**10. Sample Surveys**

- 10.1 The Three Big Ideas of Sampling
- 10.2 Populations and Parameters
- 10.3 Simple Random Samples
- 10.4 Other Sampling Designs
- 10.5 From the Population to the Sample: You Can't Always Get What You Want
- 10.6 The Valid Survey
- 10.7 Common Sampling Mistakes, or How to Sample Badly

**11. Experiments and Observational Studies**

- 11.1 Observational Studies
- 11.2 Randomized, Comparative Experiments
- 11.3 The Four Principles of Experimental Design
- 11.4 Control Groups
- 11.5 Blocking
- 11.6 Confounding
- Review of Part III: Gathering Data

### IV. RANDOMNESS AND PROBABILITY

**12. From Randomness to Probability**

- 12.1 Random Phenomena
- 12.2 Modeling Probability
- 12.3 Formal Probability

**13. Probability Rules!**

- 13.1 The General Addition Rule
- 13.2 Conditional Probability and the General Multiplication Rule
- 13.3 Independence
- 13.4 Picturing Probability: Tables, Venn Diagrams, and Trees
- 13.5 Reversing the Conditioning and Bayes' Rule

**14. Random Variables**

- 14.1 Center: The Expected Value
- 14.2 Spread: The Standard Deviation
- 14.3 Shifting and Combining Random Variables
- 14.4 Continuous Random Variables

**15. Probability Models**

- 15.1 Bernoulli Trials
- 15.2 The Geometric Model
- 15.3 The Binomial Model
- 15.4 Approximating the Binomial with a Normal Model
- 15.5 The Continuity Correction
- 15.6 The Poisson Model
- 15.7 Other Continuous Random Variables: The Uniform and the Exponential
- Review of Part IV: Randomness and Probability

### V. INFERENCE FOR ONE PARAMETER

**16. Sampling Distribution Models and Confidence Intervals for Proportions**

- 16.1 The Sampling Distribution Model for a Proportion
- 16.2 When Does the Normal Model Work? Assumptions and Conditions
- 16.3 A Confidence Interval for a Proportion
- 16.4 Interpreting Confidence Intervals: What Does 95% Confidence Really Mean?
- 16.5 Margin of Error: Certainty vs. Precision
- 16.6 Choosing the Sample Size

**17. Confidence Intervals for Means**

- 17.1 The Central Limit Theorem
- 17.2 A Confidence Interval for the Mean
- 17.3 Interpreting Confidence Intervals
- 17.4 Picking Our Interval up by Our Bootstraps
- 17.5 Thoughts About Confidence Intervals

**18. Testing Hypotheses**

- 18.1 Hypotheses
- 18.2 P-Values
- 18.3 The Reasoning of Hypothesis Testing
- 18.4 A Hypothesis Test for the Mean
- 18.5 Intervals and Tests
- 18.6 P-Values and Decisions: What to Tell About a Hypothesis Test

**19. More About Tests and Intervals**

- 19.1 Interpreting P-Values
- 19.2 Alpha Levels and Critical Values
- 19.3 Practical vs. Statistical Significance
- 19.4 Errors
- Review of Part V: Inference for One Parameter

### VI. INFERENCE FOR RELATIONSHIPS

**20. Comparing Groups**

- 20.1 A Confidence Interval for the Difference Between Two Proportions
- 20.2 Assumptions and Conditions for Comparing Proportions
- 20.3 The Two-Sample *z*-Test: Testing for the Difference Between Proportions
- 20.4 A Confidence Interval for the Difference Between Two Means
- 20.5 The Two-Sample *t*-Test: Testing for the Difference Between Two Means
- 20.6 Randomization Tests and Confidence Intervals for Two Means
- 20.7 Pooling
- 20.8 The Standard Deviation of a Difference

**21. Paired Samples and Blocks**

- 21.1 Paired Data
- 21.2 The Paired *t*-Test
- 21.3 Confidence Intervals for Matched Pairs
- 21.4 Blocking

**22. Comparing Counts**

- 22.1 Goodness-of-Fit Tests
- 22.2 Chi-Square Test of Homogeneity
- 22.3 Examining the Residuals
- 22.4 Chi-Square Test of Independence

**23. Inferences for Regression**

- 23.1 The Regression Model
- 23.2 Assumptions and Conditions
- 23.3 Regression Inference and Intuition
- 23.4 The Regression Table
- 23.5 Multiple Regression Inference
- 23.6 Confidence and Prediction Intervals
- 23.7 Logistic Regression
- 23.8 More About Regression
- Review of Part VI: Inference for Relationships

### VII. INFERENCE WHEN VARIABLES ARE RELATED

**24. Multiple Regression Wisdom**

- 24.1 Multiple Regression Inference
- 24.2 Comparing Multiple Regression Models
- 24.3 Indicators
- 24.4 Diagnosing Regression Models: Looking at the Cases
- 24.5 Building Multiple Regression Models

**25. Analysis of Variance**

- 25.1 Testing Whether the Means of Several Groups Are Equal
- 25.2 The ANOVA Table
- 25.3 Assumptions and Conditions
- 25.4 Comparing Means
- 25.5 ANOVA on Observational Data

**26. Multifactor Analysis of Variance**

- 26.1 A Two Factor ANOVA Model
- 26.2 Assumptions and Conditions
- 26.3 Interactions

**27. Statistics and Data Science**

- 27.1 Introduction to Data Mining
- Review of Part VII: Inference When Variables Are Related

- Parts I - V Cumulative Review Exercises

### Appendices

- Answers
- Credits
- Indexes
- Tables and Selected Formulas

### About our authors

**Richard D. De Veaux** is an internationally known educator and consultant. He has taught at the Wharton School and the Princeton University School of Engineering, where he won a Lifetime Award for Dedication and Excellence in Teaching. He is the C. Carlisle and M. Tippit Professor of Statistics at Williams College, where he has taught since 1994. Dick has won both the Wilcoxon and Shewell awards from the American Society for Quality. He is a fellow of the American Statistical Association (ASA) and an elected member of the International Statistical Institute (ISI). In 2008, he was named Statistician of the Year by the Boston Chapter of the ASA, and was the 2018-2021 Vice-President of the ASA. Dick is also well known in industry, where for more than 30 years he has consulted for such Fortune 500 companies as American Express, Hewlett-Packard, Alcoa, DuPont, Pillsbury, General Electric, and Chemical Bank. Because he consulted with Mickey Hart on his book **Planet Drum**, he has also sometimes been called the "Official Statistician for the Grateful Dead." His real-world experiences and anecdotes illustrate many of this book's chapters.

Dick holds degrees from Princeton University in Civil Engineering (B.S.E.) and Mathematics (A.B.) and from Stanford University in Dance Education (M.A.) and Statistics (Ph.D.), where he studied dance with Inga Weiss and Statistics with Persi Diaconis. His research focuses on the analysis of large data sets and data mining in science and industry.

In his spare time, he is an avid cyclist and swimmer. He also is the founder of the "Diminished Faculty," an a cappella Doo-Wop quartet at Williams College, and sings bass in the college concert choir and with the Choeur Vittoria of Paris. Dick is the father of 4 children.

**Paul F. Velleman** has an international reputation for innovative Statistics education. He is the author and designer of the multimedia Statistics program ActivStats, for which he was awarded the EDUCOM Medal for innovative uses of computers in teaching statistics, and the ICTCM Award for Innovation in Using Technology in College Mathematics. He also developed the award-winning statistics program Data Desk; the Internet site Data and Story Library (DASL), which provides data sets for teaching Statistics; and the tools referenced in the text for simulation and bootstrapping. Paul's understanding of using and teaching with technology informs much of this book's approach.

Paul taught Statistics at Cornell University, where he was awarded the MacIntyre Award for Exemplary Teaching. He is Emeritus Professor of Statistical Science from Cornell and lives in Maine with his wife, Sue Michlovitz. He holds an A.B. from Dartmouth College in Mathematics and Social Science, and M.S. and Ph.D. degrees in Statistics from Princeton University, where he studied with John Tukey. His research often deals with statistical graphics and data analysis methods. Paul co-authored (with David Hoaglin) **ABCs of Exploratory Data Analysis**. Paul is a Fellow of the American Statistical Association and of the American Association for the Advancement of Science. Paul is the father of 2 boys. In his spare time he sings with the a cappella group VoXX and studies tai chi.

**David E. Bock** taught mathematics at Ithaca High School for 35 years. He has taught Statistics at Ithaca High School, Tompkins-Cortland Community College, Ithaca College, and Cornell University. Dave has won numerous teaching awards, including the MAA's Edyth May Sliffe Award for Distinguished High School Mathematics Teaching (2 times) and Cornell University's Outstanding Educator Award (3 times), and has been a finalist for New York State Teacher of the Year.

Dave holds degrees from the University at Albany in Mathematics (B.A.) and Statistics/Education (M.S.). Dave has been a reader and table leader for the AP Statistics exam and a Statistics consultant to the College Board, leading workshops and institutes for AP Statistics teachers. His understanding of how students learn informs much of this book's approach.
