Pandas for Everyone: Python Data Analysis, 2nd edition

Daniel Y. Chen

VitalSource eTextbook

Pandas for Everyone: Python Data Analysis
ISBN-13: 9780137891054 | Published 2022

$66.95

$66.95 AUD

Instant access

Paperback

Pandas for Everyone: Python Data Analysis
ISBN-13: 9780137891153 | Published 2022

$64.00

$66.95 AUD

Instant access

Title overview

Manage and automate data analysis with Pandas in Python

Today, analysts must manage data characterised by extraordinary variety, velocity, and volume. With the open source Pandas library, you can use Python to rapidly automate and perform virtually any data analysis task, no matter how large or complex. Pandas can help you ensure the veracity of your data, visualise it for effective decision-making, and reliably reproduce analyses across multiple data sets.

Pandas for Everyone, 2nd Edition, brings together practical knowledge and insight for solving real problems with Pandas, even if you're new to Python data analysis. Daniel Y. Chen introduces key concepts through simple but practical examples, incrementally building on them to solve more difficult, real-world data science problems, such as using regularisation to prevent overfitting or deciding when to use unsupervised machine learning methods to find the underlying structure in a data set.

New features in the second edition include:

Extended coverage of plotting and the seaborn data visualisation library
Expanded examples and resources
Updated Python 3.9 code and package coverage, including the statsmodels and scikit-learn libraries
Online bonus material on geopandas, Dask, and creating interactive graphics with Altair

Chen gives you a jumpstart on using Pandas with a realistic data set and covers combining data sets, handling missing data, and structuring data sets for easier analysis and visualisation. He demonstrates powerful data cleaning techniques, from basic string manipulation to applying functions simultaneously across data frames.
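
As a rough illustration of the tasks described above (not an excerpt from the book), the following sketch combines two small tables, fills a missing value, and reshapes the result into a tidy long format. The data, column names, and the choice to fill the gap with the column mean are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical data: two small tables sharing a "country" key.
population = pd.DataFrame(
    {
        "country": ["A", "B", "C"],
        "pop_2020": [10.0, None, 30.0],
        "pop_2021": [11.0, 21.0, 31.0],
    }
)
regions = pd.DataFrame(
    {"country": ["A", "B", "C"], "region": ["North", "South", "North"]}
)

# Combine the data sets on the shared key.
merged = population.merge(regions, on="country", how="left")

# Handle missing data: fill the gap with the column mean
# (one of several possible strategies, chosen here for brevity).
merged["pop_2020"] = merged["pop_2020"].fillna(merged["pop_2020"].mean())

# Restructure from wide to long ("tidy") format:
# one row per country-year observation.
tidy = merged.melt(
    id_vars=["country", "region"],
    value_vars=["pop_2020", "pop_2021"],
    var_name="year",
    value_name="population",
)

# Basic string manipulation applied to the whole column at once.
tidy["year"] = tidy["year"].str.replace("pop_", "", regex=False).astype(int)

print(tidy)
```

The long format produced by melt is usually easier to plot and to group than the original wide layout, which is why it is used here.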

Once your data is ready, Chen guides you through fitting models for prediction, clustering, inference, and exploration. He provides tips on performance and scalability and introduces you to the wider Python data analysis ecosystem.

Work with DataFrames and Series, and import or export data
Create plots with matplotlib, seaborn, and pandas
Combine data sets and handle missing data
Reshape, tidy, and clean data sets so they're easier to work with
Convert data types and manipulate text strings
Apply functions to scale data manipulations
Aggregate, transform, and filter large data sets with groupby
Leverage Pandas' advanced date and time capabilities
Fit linear models using the statsmodels and scikit-learn libraries
Use generalised linear modelling to fit models with different response variables
Compare multiple models to select the 'best' one
Regularise to overcome overfitting and improve performance
Use clustering in unsupervised machine learning
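
A few of the operations listed above can be sketched in a handful of lines. The example below is illustrative only and is not taken from the book; the sales data and column names are made up.

```python
import pandas as pd

# Hypothetical sales records, with dates and amounts stored as text.
sales = pd.DataFrame(
    {
        "store": ["x", "x", "y", "y", "z"],
        "date": ["2022-01-03", "2022-02-14", "2022-01-20", "2022-03-01", "2022-01-09"],
        "amount": ["100", "150", "80", "120", "40"],
    }
)

# Convert data types: text to datetime and text to float.
sales["date"] = pd.to_datetime(sales["date"])
sales["amount"] = sales["amount"].astype(float)

# Aggregate: one total per store.
totals = sales.groupby("store")["amount"].sum()

# Transform: each sale as a share of its store's total
# (the result has the same length as the input).
sales["share"] = sales["amount"] / sales.groupby("store")["amount"].transform("sum")

# Filter: keep only the rows of stores with more than one sale.
repeat_stores = sales.groupby("store").filter(lambda g: len(g) > 1)

# Date and time capabilities: extract the month from each timestamp.
sales["month"] = sales["date"].dt.month

print(totals)
print(repeat_stores)
```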

Table of contents

Part I: Introduction
Chapter 1. Pandas DataFrame Basics
Chapter 2. Pandas Data Structures Basics
Chapter 3. Plotting Basics
Chapter 4. Tidy Data
Chapter 5. Apply Functions
Part II: Data Processing
Chapter 6. Data Assembly
Chapter 7. Data Normalization
Chapter 8. Group by Operations: Split-Apply-Combine
Part III: Data Types
Chapter 9. Missing Data
Chapter 10. Data Types
Chapter 11. Strings and Text Data
Chapter 12. Dates and Times
Part IV: Data Modeling
Chapter 13. Linear Regression (Continuous Outcome Variable)
Chapter 14. Generalized Linear Models
Chapter 15. Survival Analysis
Chapter 16. Model Diagnostics
Chapter 17. Regularization
Chapter 18. Clustering
Part V: Conclusion
Chapter 19. Life Outside of Pandas
Chapter 20. It's Dangerous To Go Alone!

Pearson