Apache Spark Machine Learning Blueprints (Paperback)

Alex Liu

  • Publisher: Packt Publishing
  • Publication date: 2016-01-29
  • List price: $1,330
  • Sale price: $1,064 (20% off)
  • Language: English
  • Pages: 252
  • Binding: Paperback
  • ISBN: 178588039X
  • ISBN-13: 9781785880391
  • Related categories: Spark, Machine Learning
  • Ships immediately (stock: 1)

Product Description

Key Features

  • Customize Apache Spark and R to fit your analytical needs in customer research, fraud detection, risk analytics, and recommendation engine development
  • Develop a set of practical Machine Learning applications that can be implemented in real-life projects
  • A comprehensive, project-based guide to improving and refining your predictive models for practical implementation

Book Description

There's a reason why Apache Spark has become one of the most popular tools in Machine Learning – its ability to handle huge datasets at an impressive speed means you can be much more responsive to the data at your disposal. This book shows you Spark at its very best, demonstrating how to connect it with R and unlock maximum value not only from the tool but also from your data.

This book is packed with project "blueprints" that demonstrate some of the most interesting challenges Spark can help you tackle. You'll find out how to use Spark notebooks and how to access, clean, and join different datasets, then put that knowledge into practice in real-world projects that show how Spark machine learning can help with everything from fraud detection to analyzing customer attrition. You'll also find out how to build a recommendation engine using Spark's parallel computing capabilities.

What you will learn

  • Set up Apache Spark for machine learning and discover its impressive processing power
  • Combine Spark and R to unlock detailed business insights essential for decision making
  • Build machine learning systems with Spark that can detect fraud and analyze financial risks
  • Build predictive models focusing on customer scoring and service ranking
  • Build a recommendation system using SPSS on Apache Spark
  • Tackle parallel computing and find out how it can support your machine learning projects
  • Turn open data and communication data into actionable insights by making use of various forms of machine learning

About the Author

Alex Liu is an expert in research methods and data science. He is currently one of IBM's leading experts in Big Data analytics and a lead data scientist; in that role he serves big corporations, develops Big Data analytics IPs, and speaks at industry conferences such as STRATA, Insights, SMAC, and BigDataCamp. In the past, Alex served as chief or lead data scientist for a few companies, including Yapstone, RS, and TRG. Before this, he was a lead consultant and director at RMA, where he provided data analytics consultation and training to many well-known organizations, including the United Nations, Indymac, AOL, Ingram Micro, GEM, Farmers Insurance, Scripps Networks, Sears, and USAID. At the same time, he taught advanced research methods to PhD candidates at the University of Southern California and the University of California, Irvine. Before this, he worked as a managing director for CATE/GEC and as a research fellow for the Asia/Pacific Research Center at Stanford University. Alex has a Ph.D. in quantitative sociology and a master of science degree in statistical computing from Stanford University.

Table of Contents

  1. Spark for Machine Learning
  2. Data Preparation for Spark ML
  3. A Holistic View on Spark
  4. Fraud Detection on Spark
  5. Risk Scoring on Spark
  6. Churn Prediction on Spark
  7. Recommendations on Spark
  8. Learning Analytics on Spark
  9. City Analytics on Spark
  10. Learning Telco Data on Spark
  11. Modeling Open Data on Spark
