Machine Learning Fundamentals with Kubeflow

Name: Machine Learning Fundamentals with Kubeflow
Author: Trevor Grant

From Lab to Production

Paperback, English, 2020, 1st edition, 9781492050124

Not available.

Summary

If you're training a machine learning model but aren't sure how to put it into production, this book will get you there. Kubeflow provides a collection of cloud native tools for different stages of a model's lifecycle, from data exploration, feature preparation, and model training to model serving. This guide helps data scientists build production-grade machine learning implementations with Kubeflow and shows data engineers how to make models scalable and reliable.

Using examples throughout the book, authors Holden Karau, Trevor Grant, Ilan Filonenko, Richard Liu, and Boris Lublinsky explain how to use Kubeflow to train and serve your machine learning models on top of Kubernetes in the cloud or in a development environment on-premises.

- Understand Kubeflow's design, core components, and the problems it solves
- Understand the differences between Kubeflow on different cluster types
- Train models using Kubeflow with popular tools including Scikit-learn, TensorFlow, and Apache Spark
- Keep your model up to date with Kubeflow Pipelines
- Understand how to capture model training metadata
- Explore how to extend Kubeflow with additional open source tools
- Use hyperparameter tuning for training
- Learn how to serve your model in production

Specifications

ISBN13: 9781492050124

Keywords: Programming, machine learning, Kubeflow

Language: English

Binding: paperback

Number of pages: 241

Publisher: O'Reilly

Edition: 1

Publication date: 30-11-2020

Main category: IT management / ICT

Table of Contents

Foreword
Preface
Our Assumption About You
Your Responsibility as a Practitioner
Conventions Used in This Book
Code Examples
Using Code Examples
O’Reilly Online Learning
How to Contact the Authors
How to Contact Us
Acknowledgments
Grievances

1. Kubeflow: What It Is and Who It Is For
Model Development Life Cycle
Where Does Kubeflow Fit In?
Why Containerize?
Why Kubernetes?
Kubeflow’s Design and Core Components
Data Exploration with Notebooks
Data/Feature Preparation
Training
Hyperparameter Tuning
Model Validation
Inference/Prediction
Pipelines
Component Overview
Alternatives to Kubeflow
Clipper (RiseLabs)
MLflow (Databricks)
Others
Introducing Our Case Studies
Modified National Institute of Standards and Technology
Mailing List Data
Product Recommender
CT Scans
Conclusion

2. Hello Kubeflow
Getting Set Up with Kubeflow
Installing Kubeflow and Its Dependencies
Setting Up Local Kubernetes
Setting Up Your Kubeflow Development Environment
Creating Our First Kubeflow Project
Training and Deploying a Model
Training and Monitoring Progress
Test Query
Going Beyond a Local Deployment
Conclusion

3. Kubeflow Design: Beyond the Basics
Getting Around the Central Dashboard
Notebooks (JupyterHub)
Training Operators
Kubeflow Pipelines
Hyperparameter Tuning
Model Inference
Metadata
Component Summary
Support Components
MinIO
Istio
Knative
Apache Spark
Kubeflow Multiuser Isolation
Conclusion

4. Kubeflow Pipelines
Getting Started with Pipelines
Exploring the Prepackaged Sample Pipelines
Building a Simple Pipeline in Python
Storing Data Between Steps
Introduction to Kubeflow Pipelines Components
Argo: the Foundation of Pipelines
What Kubeflow Pipelines Adds to Argo Workflow
Building a Pipeline Using Existing Images
Kubeflow Pipeline Components
Advanced Topics in Pipelines
Conditional Execution of Pipeline Stages
Running Pipelines on Schedule
Conclusion

5. Data and Feature Preparation
Deciding on the Correct Tooling
Local Data and Feature Preparation
Fetching the Data
Data Cleaning: Filtering Out the Junk
Formatting the Data
Feature Preparation
Custom Containers
Distributed Tooling
TensorFlow Extended
Distributed Data Using Apache Spark
Distributed Feature Preparation Using Apache Spark
Putting It Together in a Pipeline
Using an Entire Notebook as a Data Preparation Pipeline Stage
Conclusion

6. Artifact and Metadata Store
Kubeflow ML Metadata
Programmatic Query
Kubeflow Metadata UI
Using MLflow’s Metadata Tools with Kubeflow
Creating and Deploying an MLflow Tracking Server
Logging Data on Runs
Using the MLflow UI
Conclusion

7. Training a Machine Learning Model
Building a Recommender with TensorFlow
Getting Started
Starting a New Notebook Session
TensorFlow Training
Deploying a TensorFlow Training Job
Distributed Training
Using GPUs
Using Other Frameworks for Distributed Training
Training a Model Using Scikit-Learn
Starting a New Notebook Session
Data Preparation
Scikit-Learn Training
Explaining the Model
Exporting Model
Integration into Pipelines
Conclusion

8. Model Inference
Model Serving
Model Serving Requirements
Model Monitoring
Model Accuracy, Drift, and Explainability
Model Monitoring Requirements
Model Updating
Model Updating Requirements
Summary of Inference Requirements
Model Inference in Kubeflow
TensorFlow Serving
Review
Seldon Core
Designing a Seldon Inference Graph
Testing Your Model
Serving Requests
Monitoring Your Models
Review
KFServing
Serverless and the Service Plane
Data Plane
Example Walkthrough
Peeling Back the Underlying Infrastructure
Review
Conclusion

9. Case Study Using Multiple Tools
The Denoising CT Scans Example
Data Prep with Python
DS-SVD with Apache Spark
Visualization
The CT Scan Denoising Pipeline
Sharing the Pipeline
Conclusion

10. Hyperparameter Tuning and Automated Machine Learning
AutoML: An Overview
Hyperparameter Tuning with Kubeflow Katib
Katib Concepts
Installing Katib
Running Your First Katib Experiment
Prepping Your Training Code
Configuring an Experiment
Running the Experiment
Katib User Interface
Tuning Distributed Training Jobs
Neural Architecture Search
Advantages of Katib over Other Frameworks
Conclusion

A: Argo Executor Configurations and Trade-Offs
B: Cloud-Specific Tools and Configuration
Google Cloud
TPU-Accelerated Instances
Dataflow for TFX
C: Using Model Serving in Applications
Building Streaming Applications Leveraging Model Serving
Stream Processing Engines and Libraries
Introducing Cloudflow
Building Batch Applications Leveraging Model Serving

Index