Practical Machine Learning with H2O
Powerful, Scalable Techniques for Deep Learning and AI
Paperback, English, 2016, 1st edition, ISBN 9781491964606
Summary
Machine learning has finally come of age. With H2O software, you can perform machine learning and data analysis using a simple open source framework that’s easy to use, has a wide range of OS and language support, and scales for big data. This hands-on guide teaches you how to use H2O with only minimal math and theory behind the learning algorithms.
If you’re familiar with R or Python, know a bit of statistics, and have some experience manipulating data, author Darren Cook will take you through H2O basics and help you conduct machine-learning experiments on different sample data sets. You’ll explore several modern machine-learning techniques such as deep learning, random forests, unsupervised learning, and ensemble learning.
- Learn how to import, manipulate, and export data with H2O
- Explore key machine-learning concepts, such as cross-validation and validation data sets
- Work with three diverse data sets: one for regression, one for multinomial classification, and one for binomial classification
- Use H2O to analyze each sample data set with four supervised machine-learning algorithms
- Understand how cluster analysis and other unsupervised machine-learning algorithms work
Table of Contents
1. INSTALLATION AND QUICK-START
-PREPARING TO INSTALL
-INSTALL H2O WITH R (CRAN)
-INSTALL H2O WITH PYTHON (PIP)
-OUR FIRST LEARNING
-FLOW
-SUMMARY
2. DATA IMPORT, DATA EXPORT
-MEMORY REQUIREMENTS
-PREPARING THE DATA
-GETTING DATA INTO H2O
-DATA MANIPULATION
-GETTING DATA OUT OF H2O
-SUMMARY
3. THE DATA SETS
-DATA SET: BUILDING ENERGY EFFICIENCY
-DATA SET: HANDWRITTEN DIGITS
-DATA SET: FOOTBALL SCORES
-SUMMARY
4. COMMON MODEL PARAMETERS
-SUPPORTED METRICS
-THE ESSENTIALS
-EFFORT
-SCORING AND VALIDATION
-EARLY STOPPING
-CHECKPOINTS
-CROSS-VALIDATION (AKA K-FOLDS)
-DATA WEIGHTING
-SAMPLING, GENERALIZING
-REGRESSION
-OUTPUT CONTROL
-SUMMARY
5. RANDOM FOREST
-DECISION TREES
-RANDOM FOREST
-PARAMETERS
-BUILDING ENERGY EFFICIENCY: DEFAULT RANDOM FOREST
-GRID SEARCH
-BUILDING ENERGY EFFICIENCY: TUNED RANDOM FOREST
-MNIST: DEFAULT RANDOM FOREST
-MNIST: TUNED RANDOM FOREST
-FOOTBALL: DEFAULT RANDOM FOREST
-FOOTBALL: TUNED RANDOM FOREST
-SUMMARY
6. GRADIENT BOOSTING MACHINES
-BOOSTING
-THE GOOD, THE BAD, AND… THE MYSTERIOUS
-PARAMETERS
-BUILDING ENERGY EFFICIENCY: DEFAULT GBM
-BUILDING ENERGY EFFICIENCY: TUNED GBM
-MNIST: DEFAULT GBM
-MNIST: TUNED GBM
-FOOTBALL: DEFAULT GBM
-FOOTBALL: TUNED GBM
-SUMMARY
7. LINEAR MODELS
-GLM PARAMETERS
-BUILDING ENERGY EFFICIENCY: DEFAULT GLM
-BUILDING ENERGY EFFICIENCY: TUNED GLM
-MNIST: DEFAULT GLM
-MNIST: TUNED GLM
-FOOTBALL: DEFAULT GLM
-FOOTBALL: TUNED GLM
-SUMMARY
8. DEEP LEARNING (NEURAL NETS)
-WHAT ARE NEURAL NETS?
-PARAMETERS
-BUILDING ENERGY EFFICIENCY: DEFAULT DEEP LEARNING
-BUILDING ENERGY EFFICIENCY: TUNED DEEP LEARNING
-MNIST: DEFAULT DEEP LEARNING
-MNIST: TUNED DEEP LEARNING
-FOOTBALL: DEFAULT DEEP LEARNING
-FOOTBALL: TUNED DEEP LEARNING
-SUMMARY
-APPENDIX: MORE DEEP LEARNING PARAMETERS
9. UNSUPERVISED LEARNING
-K-MEANS CLUSTERING
-DEEP LEARNING AUTO-ENCODER
-PRINCIPAL COMPONENT ANALYSIS
-GLRM
-MISSING DATA
-SUMMARY
10. EVERYTHING ELSE
-STAYING ON TOP OF AND POKING INTO THINGS
-INSTALLING THE LATEST VERSION
-RUNNING FROM THE COMMAND LINE
-CLUSTERS
-SPARK / SPARKLING WATER
-NAIVE BAYES
-ENSEMBLES
-SUMMARY
11. EPILOGUE: DIDN’T THEY ALL DO WELL!
-BUILDING ENERGY RESULTS
-MNIST RESULTS
-FOOTBALL DATA
-HOW LOW CAN YOU GO?
-SUMMARY
INDEX