Practical Machine Learning with H2O: Powerful, Scalable Techniques for Deep Learning and AI

Darren Cook





<Table of Contents>

Chapter 1. Installation and Quick-Start
Preparing to Install
Install H2O with R (CRAN)
Install H2O with Python (pip)
Our First Learning
Chapter 2. Data Import, Data Export
Memory Requirements
Preparing the Data
Getting Data into H2O
Data Manipulation
Getting Data Out of H2O
Chapter 3. The Data Sets
Data Set: Building Energy Efficiency
Data Set: Handwritten Digits
Data Set: Football Scores
Chapter 4. Common Model Parameters
Supported Metrics
The Essentials
Scoring and Validation
Early Stopping
Cross-Validation (aka k-folds)
Data Weighting
Sampling, Generalizing
Output Control
Chapter 5. Random Forest
Decision Trees
Random Forest
Building Energy Efficiency: Default Random Forest
Grid Search
Building Energy Efficiency: Tuned Random Forest
MNIST: Default Random Forest
MNIST: Tuned Random Forest
Football: Default Random Forest
Football: Tuned Random Forest
Chapter 6. Gradient Boosting Machines
The Good, the Bad, and… the Mysterious
Building Energy Efficiency: Default GBM
Building Energy Efficiency: Tuned GBM
MNIST: Default GBM
Football: Default GBM
Football: Tuned GBM
Chapter 7. Linear Models
GLM Parameters
Building Energy Efficiency: Default GLM
Building Energy Efficiency: Tuned GLM
MNIST: Default GLM
Football: Default GLM
Football: Tuned GLM
Chapter 8. Deep Learning (Neural Nets)
What Are Neural Nets?
Building Energy Efficiency: Default Deep Learning
Building Energy Efficiency: Tuned Deep Learning
MNIST: Default Deep Learning
MNIST: Tuned Deep Learning
Football: Default Deep Learning
Football: Tuned Deep Learning
Appendix: More Deep Learning Parameters
Chapter 9. Unsupervised Learning
K-Means Clustering
Deep Learning Auto-Encoder
Principal Component Analysis
Missing Data
Chapter 10. Everything Else
Staying on Top of and Poking into Things
Installing the Latest Version
Running from the Command Line
Spark / Sparkling Water
Naive Bayes
Chapter 11. Epilogue: Didn’t They All Do Well!
Building Energy Results
MNIST Results
Football Data
How Low Can You Go?

<About the Author>

Darren Cook
Darren Cook has over 20 years of experience as a software developer, data analyst, and technical director, working on everything from financial trading systems to NLP, data visualization tools, and PR websites for some of the world’s largest brands. He is skilled in a wide range of computer languages, including R, C++, PHP, JavaScript, and Python. He works at QQ Trend, a financial data analysis and data products company.


The animal on the cover of Practical Machine Learning with H2O is a crayfish, a small lobster-like crustacean found in freshwater habitats throughout the world. Alternate names include crawfish, crawdads, and mudbugs, depending on the region.

There are over 500 species of crayfish, more than half of which occur in North America. Size, shape, and color vary greatly across species. Crayfish in North America are typically 3 to 4 inches long, while certain species in Australia grow to a staggering 15 inches and can weigh as much as 8 pounds.

Like crabs and other crustaceans, crayfish shed their hard outer shells periodically, eating them to recoup calcium. They are nocturnal creatures, possessing keen eyesight as well as the ability to move their eyes in different directions at once.

Crayfish have eight pairs of legs, four of which are used for walking. The other legs are used for swimming backward, a maneuver that allows the crayfish to dart quickly through the water. Lost limbs can be regenerated, a capability that comes in handy during the competitive (and often aggressive) mating season.

Crayfish are opportunistic omnivores that consume almost anything, including plants, clams, snails, insects, and dead organic matter. Their own predators include fish (crayfish are widely regarded as a tackle box staple), otters, birds, and humans. More than 100 million pounds of crawfish are produced each year in Louisiana, where the crawfish was adopted as the state's official crustacean in 1983.

Many of the animals on O'Reilly covers are endangered; all of them are important to the world. To learn more about how you can help, go to .

The cover image is from Treasury of Animal Illustrations by Dover. The cover fonts are URW Typewriter and Guardian Sans. The text font is Adobe Minion Pro; the heading font is Adobe Myriad Condensed; and the code font is Dalton Maag's Ubuntu Mono.