Scaling Python with Ray
Adventures in Cloud and Serverless Patterns
Paperback, English, 2022, 1st edition, ISBN 9781098118808
Summary
Serverless computing enables developers to concentrate solely on their applications rather than worry about where those applications are deployed. With the Ray general-purpose serverless implementation in Python, programmers and data scientists can hide servers, implement stateful applications, support direct communication between tasks, and access hardware accelerators.
In this book, experienced software architecture practitioners Holden Karau and Boris Lublinsky show you how to scale existing Python applications and pipelines, allowing you to stay in the Python ecosystem while reducing single points of failure and manual scheduling. Scaling Python with Ray is ideal for software architects and developers eager to explore successful case studies and learn more about decision and measurement effectiveness.
If your data processing or server application has grown beyond what a single computer can handle, this book is for you. You'll explore distributed processing (the pure Python implementation of serverless) and learn how to:
- Implement stateful applications with Ray actors
- Build workflow management in Ray
- Use Ray as a unified system for batch and stream processing
- Apply advanced data processing with Ray
- Build microservices with Ray
- Implement reliable Ray applications
Table of Contents
Preface
What You Will Learn
A Note on Responsibility
Conventions Used in This Book
License
Using Code Examples
O'Reilly Online Learning
How to Contact Us
Acknowledgments
From Holden
From Boris
1. What Is Ray, and Where Does It Fit?
Why Do You Need Ray?
Where Can You Run Ray?
Running Your Code with Ray
Where Does It Fit in the Ecosystem?
Big Data / Scalable DataFrames
Machine Learning
Workflow Scheduling
Streaming
Interactive
What Ray Is Not
Conclusion
2. Getting Started with Ray (Locally)
Installation
Installing for x86 and M1 ARM
Installing (from Source) for ARM
Hello Worlds
Ray Remote (Task/Futures) Hello World
Data Hello World
Actor Hello World
Conclusion
3. Remote Functions
Essentials of Ray Remote Functions
Composition of Remote Ray Functions
Ray Remote Best Practices
Bringing It Together with an Example
Conclusion
4. Remote Actors
Understanding the Actor Model
Creating a Basic Ray Remote Actor
Implementing the Actor's Persistence
Scaling Ray Remote Actors
Ray Remote Actors Best Practices
Conclusion
5. Ray Design Details
Fault Tolerance
Ray Objects
Serialization/Pickling
cloudpickle
Apache Arrow
Resources / Vertical Scaling
Autoscaler
Placement Groups: Organizing Your Tasks and Actors
Namespaces
Managing Dependencies with Runtime Environments
Deploying Ray Applications with the Ray Job API
Conclusion
6. Implementing Streaming Applications
Apache Kafka
Basic Kafka Concepts
Kafka APIs
Using Kafka with Ray
Scaling Our Implementation
Building Stream-Processing Applications with Ray
Key-Based Approach
Key-Independent Approach
Going Beyond Kafka
Conclusion
7. Implementing Microservices
Understanding Microservice Architecture in Ray
Deployment
Additional Deployment Capabilities
Deployment Composition
Using Ray Serve for Model Serving
Simple Model Service Example
Considerations for Model-Serving Implementations
Speculative Model Serving Using the Ray Microservice Framework
Conclusion
8. Ray Workflows
What Is Ray Workflows?
How Is It Different from Other Solutions?
Ray Workflows Features
What Are the Main Features?
Workflow Primitives
Working with Basic Workflow Concepts
Workflows, Steps, and Objects
Dynamic Workflows
Virtual Actors
Workflows in Real Life
Building Workflows
Managing Workflows
Building a Dynamic Workflow
Building Workflows with Conditional Steps
Handling Exceptions
Handling Durability Guarantees
Extending Dynamic Workflows with Virtual Actors
Integrating Workflows with Other Ray Primitives
Triggering Workflows (Connecting to Events)
Working with Workflow Metadata
Conclusion
9. Advanced Data with Ray
Creating and Saving Ray Datasets
Using Ray Datasets with Different Tools
Using Tools on Ray Datasets
pandas-like DataFrames with Dask
Indexing
Shuffles
Embarrassingly Parallel Operations
Working with Multiple DataFrames
What Does Not Work
What's Slower
Handling Recursive Algorithms
What Other Functions Are Different
pandas-like DataFrames with Modin
Big Data with Spark
Working with Local Tools
Using Built-in Ray Dataset Operations
Implementing Ray Datasets
Conclusion
10. How Ray Powers Machine Learning
Using scikit-learn with Ray
Using Boosting Algorithms with Ray
Using XGBoost
Using LightGBM
Using PyTorch with Ray
Reinforcement Learning with Ray
Hyperparameter Tuning with Ray
Conclusion
11. Using GPUs and Accelerators with Ray
What Are GPUs Good At?
The Building Blocks
Higher-Level Libraries
Acquiring and Releasing GPU and Accelerator Resources
Ray's ML Libraries
Autoscaler with GPUs and Accelerators
CPU Fallback as a Design Pattern
Other (Non-GPU) Accelerators
Conclusion
12. Ray in the Enterprise
Ray Dependency Security Issues
Interacting with the Existing Tools
Using Ray with CI/CD Tools
Authentication with Ray
Multitenancy on Ray
Credentials for Data Sources
Permanent Versus Ephemeral Clusters
Ephemeral Clusters
Permanent Clusters
Monitoring
Instrumenting Your Code with Ray Metrics
Wrapping Custom Programs with Ray
Conclusion
A. Space Beaver Case Study: Actors, Kubernetes, and More
High-Level Design
Implementation
Outbound Mail Client
Shared Actor Patterns and Utilities
Mail Server Actor
Satellite Actor
User Actor
SMS Actor and Serve Implementation
Testing
Deployment
Conclusion
B. Installing and Deploying Ray
Installing Ray Locally
Using Ray Docker Images
Using Ray Clusters
Installing Ray on AWS
Installing Ray on IBM Cloud
Installing Ray on Kubernetes
Installing Ray on a kind Cluster
Using ray up
Using the Ray Kubernetes Operator
Installing Ray on OpenShift
Conclusion
C. Debugging with Ray
General Debugging Tips with Ray
Serialization Errors
Local Debugging with Ray Local
Remote Debugging
Ray's Integrated Debugger (via Pdb)
Other Tools
Ray and Container Exit Codes
Ray Logs
Container Errors
Native Errors
Conclusion
Index
About the Authors