Mining of Massive Datasets, 3/e (Hardcover)

Leskovec, Jure, Rajaraman, Anand, Ullman, Jeffrey David

Product Description

Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the MapReduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream-processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets, and clustering. This third edition includes new and extended coverage on decision trees, deep learning, and mining social-network graphs.
