Machine Learning Foundations, 1st edition

Published by Addison-Wesley Professional (February 20, 2026) © 2026

  • Roi Yehoshua

Title overview

The Essential Guide to Machine Learning in the Age of AI

Machine learning stands at the heart of today's most transformative technologies, advancing scientific discovery, reshaping industries, and changing everyday life. From large language models to medical diagnosis and autonomous vehicles, the demand for robust, principled machine learning models has never been greater.

Machine Learning Foundations, Volume 1: Supervised Learning, offers a comprehensive and accessible roadmap to the core algorithms and concepts behind modern AI systems. Balancing mathematical rigor with hands-on implementation, this book not only teaches how machine learning works, but why it works. As part of a three-volume series, Volume 1 lays the foundation for mastering the full landscape of modern machine learning, including deep learning, large language models, and cutting-edge research.

Each chapter introduces core ideas with clear intuition, supports them with rigorous mathematical derivations where appropriate, and demonstrates how to implement the methods in Python, while also addressing practical considerations such as data preparation and hyperparameter tuning. Exercises at the end of each chapter, both theoretical and programming-based, reinforce understanding and promote active learning.

The book includes hundreds of fully annotated code examples, available on GitHub at github.com/roiyeho/ml-book, along with six comprehensive online appendices covering essential background in linear algebra, calculus, probability, statistics, optimization, and Python libraries such as NumPy, Pandas, and Matplotlib.

  • Master the key concepts of supervised machine learning, including model capacity, the bias-variance tradeoff, generalization, and optimization techniques
  • Implement the full supervised learning pipeline, from data preprocessing and feature engineering to model selection, training, and evaluation (a minimal sketch of such a pipeline follows this list)
  • Understand key learning tasks, including classification, regression, multi-label, and multi-output problems
  • Implement foundational algorithms from scratch, including linear and logistic regression, decision trees, gradient boosting, and SVMs
  • Gain hands-on experience with industry-standard tools such as Scikit-Learn, XGBoost, and NLTK
  • Refine and optimize your models using techniques such as hyperparameter tuning, cross-validation, and calibration
  • Work with diverse data types, including tabular data, text, and images
  • Address real-world challenges such as imbalanced datasets, missing data, and high-dimensional inputs
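
As a taste of the workflow described above, here is a minimal sketch of a supervised learning pipeline in scikit-learn, using the built-in Iris dataset (the same running example as Chapter 3). The choice of logistic regression, the parameter grid, and the split settings are illustrative assumptions, not code taken from the book.

    # Illustrative sketch (not from the book): a scikit-learn pipeline with
    # preprocessing, a classifier, and cross-validated hyperparameter search.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # Chain scaling and the classifier so the scaler is re-fit on each
    # training fold during cross-validation, avoiding data leakage.
    pipe = Pipeline([
        ("scaler", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

    # Tune the regularization strength C with 5-fold cross-validation.
    param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
    search = GridSearchCV(pipe, param_grid, cv=5)
    search.fit(X_train, y_train)

    print("Best C:", search.best_params_["clf__C"])
    print("Test accuracy:", search.score(X_test, y_test))

Wrapping the scaler and classifier in a single Pipeline is what lets GridSearchCV tune them together without leaking test-fold statistics into the preprocessing step.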

Register your book for convenient access to downloads, updates, and corrections as they become available. See inside the book for details.

Table of contents

Preface xv
About the Author xxvii

Chapter 1: Introduction to Machine Learning 1
1.1 Formal Definition 1
1.2 Types of Machine Learning 3
1.3 Related Fields 4
1.4 Brief History 5
1.5 Machine Learning Applications 8
1.6 Limitations of Machine Learning 9
1.7 Ethical Considerations 10
1.8 Software Libraries 11
1.9 Common Datasets 12
1.10 Current Trends and Future Directions 14
1.11 Summary 14

Chapter 2: Supervised Machine Learning 15
2.1 Formal Definition 16
2.2 Machine Learning Models 19
2.3 The Data-Generating Process 24
2.4 Generalization Error and Empirical Risk Minimization 28
2.5 Parameter Estimation 29
2.6 The Bias–Variance Tradeoff 31
2.7 Building a Machine Learning Model 36
2.8 Challenges in Supervised Learning 39
2.9 Summary 40
2.10 Exercises 41

Chapter 3: Introduction to Scikit-Learn 49
3.1 Main Features 50
3.2 Installation 50
3.3 The Estimator API 51
3.4 Typical Workflow of Building a Model 56
3.5 Example: Iris Classification 56
3.6 Pipelines 70
3.7 Cross-Validation 73
3.8 Hyperparameter Tuning 76
3.9 Data Preparation 83
3.10 Data Preprocessing 85
3.11 Building Your Own Estimators 106
3.12 Summary 109
3.13 Exercises 110

Chapter 4: Linear Regression 117
4.1 Formal Definitions and Notations 118
4.2 Ordinary Least Squares (OLS) 118
4.3 Simple Linear Regression 120
4.4 Multiple Linear Regression 126
4.5 Regression Evaluation Metrics 133
4.6 Example: Predicting Housing Prices 137
4.7 Linear Regression Assumptions 145
4.8 Gradient Descent for Linear Regression 151
4.9 Alternative Loss Functions 160
4.10 Nonlinear Regression 162
4.11 Regularized Linear Regression 174
4.12 (*) Bayesian Linear Regression 185
4.13 Summary 189
4.14 Exercises 190

Chapter 5: Logistic Regression 201
5.1 Classification Problems 202
5.2 The Logistic Regression Model 202
5.3 Training a Logistic Regression Model 208
5.4 Logistic Regression in Scikit-Learn 221
5.5 Classification Evaluation Metrics 223
5.6 Learning from Imbalanced Data 238
5.7 Multi-Class Extensions for Binary Classifiers 246
5.8 Multinomial Logistic Regression 247
5.9 (*) Generalized Linear Models 262
5.10 Summary 265
5.11 Exercises 266

Chapter 6: K-Nearest Neighbors 281
6.1 K-Nearest Neighbors Classification 282
6.2 Choosing the Number of Neighbors 289
6.3 Distance Metrics and Similarity Measures 292
6.4 (*) KNN Classification Error Bounds 300
6.5 (*) Efficient Data Structures for KNN 302
6.6 The Curse of Dimensionality 307
6.7 Radius-based Nearest Neighbors 312
6.8 K-Nearest Neighbors Regression 314
6.9 Variants of KNN 316
6.10 Applications of KNN 318
6.11 Summary 319
6.12 Exercises 320

Chapter 7: Naive Bayes 329
7.1 The Naive Bayes Model 330
7.2 Event Models in Naive Bayes 332
7.3 Naive Bayes Classifiers in Scikit-Learn 344
7.4 (*) Linear and Quadratic Discriminant Analysis 345
7.5 Introduction to Natural Language Processing 349
7.6 Document Classification Example 364
7.7 Bayesian Networks 379
7.8 (*) Probabilistic Graphical Models 384
7.9 Summary 385
7.10 Exercises 386

Chapter 8: Decision Trees 403
8.1 Decision Tree Definition 403
8.2 Decision Tree Construction 404
8.3 Tree Pruning 419
8.4 Decision Tree Algorithms 424
8.5 Decision Trees in Scikit-Learn 426
8.6 Regression Trees 434
8.7 Multi-Output Problems 439
8.8 Oblique Decision Trees 442
8.9 Summary 445
8.10 Exercises 447

Chapter 9: Ensemble Methods 461
9.1 Introduction and Motivation 462
9.2 Types of Ensemble Methods 465
9.3 Using Different Learning Algorithms 466
9.4 Bagging Ensembles 471
9.5 Random Forests 479
9.6 Boosting Ensembles 489
9.7 AdaBoost 490
9.8 Gradient Boosting 505
9.9 Stacking Ensembles 538
9.10 Summary 543
9.11 Exercises 544

Chapter 10: Gradient Boosting Libraries 563
10.1 XGBoost 563
10.2 LightGBM 608
10.3 CatBoost 612
10.4 The Higgs Boson Machine Learning Challenge 624
10.5 Summary 630
10.6 Exercises 630

Chapter 11: Support Vector Machines 639
11.1 Hard-Margin SVM 640
11.2 Soft-Margin SVM 659
11.3 Kernel Methods 667
11.4 Nonlinear SVM 677
11.5 ν-SVM 685
11.6 Support Vector Regression (SVR) 687
11.7 (*) Efficient Methods for Training SVMs 693
11.8 Model Calibration 696
11.9 Summary 705
11.10 Exercises 707

Chapter 12: Summary and Additional Resources 727
12.1 Supervised Learning Summary 727
12.2 Choosing a Learning Algorithm 729
12.3 Conducting Machine Learning Experiments 731
12.4 Research in Machine Learning 747
12.5 Machine Learning Competitions 754
12.6 Additional Resources 761
12.7 Looking Ahead to Volume II 767
12.8 Exercises 768

Appendix A: Linear Algebra (available online, see Preface)
Appendix B: Calculus (available online, see Preface)
Appendix C: Probability Theory (available online, see Preface)
Appendix D: Statistics (available online, see Preface)
Appendix E: Optimization (available online, see Preface)
Appendix F: Python Libraries (available online, see Preface)

Bibliography 783
Index 823
