Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning (Integrated Series in Information Systems)

Shan Suthaharan

Product Description

This book presents machine learning models and algorithms to address big data classification problems. Existing machine learning techniques such as the decision tree (a hierarchical approach), random forest (an ensemble hierarchical approach), and deep learning (a layered approach) are well suited to systems that handle such problems. This book helps readers, especially students and newcomers to the field of big data and machine learning, gain a quick understanding of the techniques and technologies; to that end, the theory, examples, and programs (MATLAB and R) presented in this book have been simplified, hardcoded, repeated, or spaced for improvements. They provide vehicles to test and understand the complicated concepts of various topics in the field. Readers are expected to adopt these programs to experiment with the examples, and then modify them or write their own programs to advance their knowledge and solve more complex and challenging problems.

The presentation format of this book focuses on simplicity, readability, and dependability so that undergraduate and graduate students, as well as new researchers, developers, and practitioners in this field, can easily trust and grasp the concepts and learn them effectively. It has been written to reduce mathematical complexity and to help the vast majority of readers understand the topics and become interested in the field. This book consists of four parts, with a total of 14 chapters. The first part focuses mainly on the topics needed to analyze and understand data and big data. The second part covers the topics that explain the systems required for processing big data. The third part presents the topics required to understand and select machine learning techniques for classifying big data. Finally, the fourth part concentrates on the topics that explain scaling up machine learning, an important solution for modern big data problems.
