Piotr Lewandowski, Adam Stubblefield, Ana Oprea, Betsy Beyer, Paul Blankinship, Heather Adkins, et al.

Building Secure and Reliable Systems

Best Practices for Designing, Implementing, and Maintaining Systems

Paperback, English, 2020, 1st edition, ISBN 9781492083122

€ 90,93

Delivery time: approximately 16 working days

Free shipping

Summary

Can a system be considered truly reliable if it isn't fundamentally secure? Or can it be considered secure if it's unreliable? Security is crucial to the design and operation of scalable systems in production, as it plays an important part in product quality, performance, and availability. In this book, experts from Google share best practices to help your organization design scalable and reliable systems that are fundamentally secure.

Two previous O’Reilly books from Google—Site Reliability Engineering and The Site Reliability Workbook—demonstrated how and why a commitment to the entire service lifecycle enables organizations to successfully build, deploy, monitor, and maintain software systems. In this latest guide, the authors offer insights into system design, implementation, and maintenance from practitioners who specialize in security and reliability. They also discuss how building and adopting their recommended best practices requires a culture that’s supportive of such change.

You’ll learn about secure and reliable systems through:
Design strategies
Recommendations for coding, testing, and debugging practices
Strategies to prepare for, respond to, and recover from incidents
Cultural best practices that help teams across your organization collaborate effectively

Specifications

ISBN-13: 9781492083122

Keywords: Programming, Systems Engineering, secure systems

Language: English

Binding: Paperback

Number of pages: 600

Publisher: O'Reilly

Edition: 1

Publication date: 27 March 2020

Main category: IT management / ICT

About Piotr Lewandowski

Piotr Lewandowski is a Senior Staff Site Reliability Engineer, and has spent the past nine years improving the security posture of Google’s infrastructure. As the Production Tech Lead for Security, he is responsible for harmonious collaboration between the SRE and security organizations. In his previous role, he led a team responsible for the reliability of Google’s critical security infrastructure. Before joining Google, he built a startup, worked at CERT Polska, and got a degree in computer science from Warsaw University of Technology.

About Adam Stubblefield

Adam Stubblefield is a Distinguished Engineer and the Horizontal Lead for Security at Google. Over the past 8 years, he’s led teams that have built much of Google’s core security infrastructure. Adam has a PhD in Computer Science from Johns Hopkins.

About Paul Blankinship

Paul Blankinship manages the Technical Writing team for Google’s Security and Privacy Engineering group. He’s previously written documentation for Google Web Designer, and helped develop Google’s internal security and privacy policies.

About Heather Adkins

Heather Adkins is a 17-year Google veteran and founding member of the Google Security Team. As Sr Director of Information Security, she has built a global team responsible for maintaining the safety and security of Google’s networks, systems and applications. She has an extensive background in systems and network administration with an emphasis on practical security, and has worked to build and secure some of the world’s largest infrastructure. She now focuses her time primarily on the defense of Google’s computing infrastructure and working with industry to tackle some of the greatest security challenges.

Table of Contents

Foreword by Royal Hansen
Foreword by Michael Wildpaner
Preface
Why We Wrote This Book
Who This Book Is For
A Note About Culture
How to Read This Book
Conventions Used in This Book
O’Reilly Online Learning
How to Contact Us
Acknowledgments

I: Introductory Material
1. The Intersection of Security and Reliability
On Passwords and Power Drills
Reliability Versus Security: Design Considerations
Confidentiality, Integrity, Availability
Confidentiality
Integrity
Availability
Reliability and Security: Commonalities
Invisibility
Assessment
Simplicity
Evolution
Resilience
From Design to Production
Investigating Systems and Logging
Crisis Response
Recovery
Conclusion

2. Understanding Adversaries
Attacker Motivations
Attacker Profiles
Hobbyists
Vulnerability Researchers
Governments and Law Enforcement
Activists
Criminal Actors
Automation and Artificial Intelligence
Insiders
Attacker Methods
Threat Intelligence
Cyber Kill Chains™
Tactics, Techniques, and Procedures
Risk Assessment Considerations
Conclusion

II: Designing Systems
3. Case Study: Safe Proxies
Safe Proxies in Production Environments
Google Tool Proxy
Conclusion

4. Design Tradeoffs
Design Objectives and Requirements
Feature Requirements
Nonfunctional Requirements
Features Versus Emergent Properties
Example: Google Design Document
Balancing Requirements
Example: Payment Processing
Managing Tensions and Aligning Goals
Example: Microservices and the Google Web Application Framework
Aligning Emergent-Property Requirements
Initial Velocity Versus Sustained Velocity
Conclusion

5. Design for Least Privilege
Concepts and Terminology
Least Privilege
Zero Trust Networking
Zero Touch
Classifying Access Based on Risk
Best Practices
Small Functional APIs
Breakglass
Auditing
Testing and Least Privilege
Diagnosing Access Denials
Graceful Failure and Breakglass Mechanisms
Worked Example: Configuration Distribution
POSIX API via OpenSSH
Software Update API
Custom OpenSSH ForceCommand
Custom HTTP Receiver (Sidecar)
Custom HTTP Receiver (In-Process)
Tradeoffs
A Policy Framework for Authentication and Authorization Decisions
Using Advanced Authorization Controls
Investing in a Widely Used Authorization Framework
Avoiding Potential Pitfalls
Advanced Controls
Multi-Party Authorization (MPA)
Three-Factor Authorization (3FA)
Business Justifications
Temporary Access
Proxies
Tradeoffs and Tensions
Increased Security Complexity
Impact on Collaboration and Company Culture
Quality Data and Systems That Impact Security
Impact on User Productivity
Impact on Developer Complexity
Conclusion

6. Design for Understandability
Why Is Understandability Important?
System Invariants
Analyzing Invariants
Mental Models
Designing Understandable Systems
Complexity Versus Understandability
Breaking Down Complexity
Centralized Responsibility for Security and Reliability Requirements
System Architecture
Understandable Interface Specifications
Understandable Identities, Authentication, and Access Control
Security Boundaries
Software Design
Using Application Frameworks for Service-Wide Requirements
Understanding Complex Data Flows
Considering API Usability
Conclusion

7. Design for a Changing Landscape
Types of Security Changes
Designing Your Change
Architecture Decisions to Make Changes Easier
Keep Dependencies Up to Date and Rebuild Frequently
Release Frequently Using Automated Testing
Use Containers
Use Microservices
Different Changes: Different Speeds, Different Timelines
Short-Term Change: Zero-Day Vulnerability
Medium-Term Change: Improvement to Security Posture
Long-Term Change: External Demand
Complications: When Plans Change
Example: Growing Scope—Heartbleed
Conclusion

8. Design for Resilience
Design Principles for Resilience
Defense in Depth
The Trojan Horse
Google App Engine Analysis
Controlling Degradation
Differentiate Costs of Failures
Deploy Response Mechanisms
Automate Responsibly
Controlling the Blast Radius
Role Separation
Location Separation
Time Separation
Failure Domains and Redundancies
Failure Domains
Component Types
Controlling Redundancies
Continuous Validation
Validation Focus Areas
Validation in Practice
Practical Advice: Where to Begin
Conclusion

9. Design for Recovery
What Are We Recovering From?
Random Errors
Accidental Errors
Software Errors
Malicious Actions
Design Principles for Recovery
Design to Go as Quickly as Possible (Guarded by Policy)
Limit Your Dependencies on External Notions of Time
Rollbacks Represent a Tradeoff Between Security and Reliability
Use an Explicit Revocation Mechanism
Know Your Intended State, Down to the Bytes
Design for Testing and Continuous Validation
Emergency Access
Access Controls
Communications
Responder Habits
Unexpected Benefits
Conclusion

10. Mitigating Denial-of-Service Attacks
Strategies for Attack and Defense
Attacker’s Strategy
Defender’s Strategy
Designing for Defense
Defendable Architecture
Defendable Services
Mitigating Attacks
Monitoring and Alerting
Graceful Degradation
A DoS Mitigation System
Strategic Response
Dealing with Self-Inflicted Attacks
User Behavior
Client Retry Behavior
Conclusion

III: Implementing Systems
11. Case Study: Designing, Implementing, and Maintaining a Publicly Trusted CA
Background on Publicly Trusted Certificate Authorities
Why Did We Need a Publicly Trusted CA?
The Build or Buy Decision
Design, Implementation, and Maintenance Considerations
Programming Language Choice
Complexity Versus Understandability
Securing Third-Party and Open Source Components
Testing
Resiliency for the CA Key Material
Data Validation
Conclusion

12. Writing Code
Frameworks to Enforce Security and Reliability
Benefits of Using Frameworks
Example: Framework for RPC Backends
Common Security Vulnerabilities
SQL Injection Vulnerabilities: TrustedSqlString
Preventing XSS: SafeHtml
Lessons for Evaluating and Building Frameworks
Simple, Safe, Reliable Libraries for Common Tasks
Rollout Strategy
Simplicity Leads to Secure and Reliable Code
Avoid Multilevel Nesting
Eliminate YAGNI Smells
Repay Technical Debt
Refactoring
Security and Reliability by Default
Choose the Right Tools
Use Strong Types
Sanitize Your Code
Conclusion

13. Testing Code
Unit Testing
Writing Effective Unit Tests
When to Write Unit Tests
How Unit Testing Affects Code
Integration Testing
Writing Effective Integration Tests
Dynamic Program Analysis
Fuzz Testing
How Fuzz Engines Work
Writing Effective Fuzz Drivers
An Example Fuzzer
Continuous Fuzzing
Static Program Analysis
Automated Code Inspection Tools
Integration of Static Analysis in the Developer Workflow
Abstract Interpretation
Formal Methods
Conclusion

14. Deploying Code
Concepts and Terminology
Threat Model
Best Practices
Require Code Reviews
Rely on Automation
Verify Artifacts, Not Just People
Treat Configuration as Code
Securing Against the Threat Model
Advanced Mitigation Strategies
Binary Provenance
Provenance-Based Deployment Policies
Verifiable Builds
Deployment Choke Points
Post-Deployment Verification
Practical Advice
Take It One Step at a Time
Provide Actionable Error Messages
Ensure Unambiguous Provenance
Create Unambiguous Policies
Include a Deployment Breakglass
Securing Against the Threat Model, Revisited
Conclusion

15. Investigating Systems
From Debugging to Investigation
Example: Temporary Files
Debugging Techniques
What to Do When You’re Stuck
Collaborative Debugging: A Way to Teach
How Security Investigations and Debugging Differ
Collect Appropriate and Useful Logs
Design Your Logging to Be Immutable
Take Privacy into Consideration
Determine Which Security Logs to Retain
Budget for Logging
Robust, Secure Debugging Access
Reliability
Security
Conclusion

IV: Maintaining Systems
16. Disaster Planning
Defining “Disaster”
Dynamic Disaster Response Strategies
Disaster Risk Analysis
Setting Up an Incident Response Team
Identify Team Members and Roles
Establish a Team Charter
Establish Severity and Priority Models
Define Operating Parameters for Engaging the IR Team
Develop Response Plans
Create Detailed Playbooks
Ensure Access and Update Mechanisms Are in Place
Prestaging Systems and People Before an Incident
Configuring Systems
Training
Processes and Procedures
Testing Systems and Response Plans
Auditing Automated Systems
Conducting Nonintrusive Tabletops
Testing Response in Production Environments
Red Team Testing
Evaluating Responses
Google Examples
Test with Global Impact
DiRT Exercise Testing Emergency Access
Industry-Wide Vulnerabilities
Conclusion

17. Crisis Management
Is It a Crisis or Not?
Triaging the Incident
Compromises Versus Bugs
Taking Command of Your Incident
The First Step: Don’t Panic!
Beginning Your Response
Establishing Your Incident Team
Operational Security
Trading Good OpSec for the Greater Good
The Investigative Process
Keeping Control of the Incident
Parallelizing the Incident
Handovers
Morale
Communications
Misunderstandings
Hedging
Meetings
Keeping the Right People Informed with the Right Levels of Detail
Putting It All Together
Triage
Declaring an Incident
Communications and Operational Security
Beginning the Incident
Handover
Handing Back the Incident
Preparing Communications and Remediation
Closure
Conclusion

18. Recovery and Aftermath
Recovery Logistics
Recovery Timeline
Planning the Recovery
Scoping the Recovery
Recovery Considerations
Recovery Checklists
Initiating the Recovery
Isolating Assets (Quarantine)
System Rebuilds and Software Upgrades
Data Sanitization
Recovery Data
Credential and Secret Rotation
After the Recovery
Postmortems
Examples
Compromised Cloud Instances
Large-Scale Phishing Attack
Targeted Attack Requiring Complex Recovery
Conclusion

V: Organization and Culture
19. Case Study: Chrome Security Team
Background and Team Evolution
Security Is a Team Responsibility
Help Users Safely Navigate the Web
Speed Matters
Design for Defense in Depth
Be Transparent and Engage the Community
Conclusion

20. Understanding Roles and Responsibilities
Who Is Responsible for Security and Reliability?
The Roles of Specialists
Understanding Security Expertise
Certifications and Academia
Integrating Security into the Organization
Embedding Security Specialists and Security Teams
Example: Embedding Security at Google
Special Teams: Blue and Red Teams
External Researchers
Conclusion

21. Building a Culture of Security and Reliability
Defining a Healthy Security and Reliability Culture
Culture of Security and Reliability by Default
Culture of Review
Culture of Awareness
Culture of Yes
Culture of Inevitably
Culture of Sustainability
Changing Culture Through Good Practice
Align Project Goals and Participant Incentives
Reduce Fear with Risk-Reduction Mechanisms
Make Safety Nets the Norm
Increase Productivity and Usability
Overcommunicate and Be Transparent
Build Empathy
Convincing Leadership
Understand the Decision-Making Process
Build a Case for Change
Pick Your Battles
Escalations and Problem Resolution
Conclusion

Conclusion

A. A Disaster Risk Assessment Matrix

Index
