Data Science with Java: Practical Methods for Scientists and Engineers

Michael R. Brzustowicz, PhD

Product Description

Data science is booming thanks to R and Python, but Java brings the robustness, convenience, and scalability that today’s data science applications demand. With this practical book, Java software engineers looking to add data science skills will take a logical journey through the data science pipeline. Author Michael Brzustowicz explains the basic math theory behind each step of the data science process, as well as how to apply these concepts with Java.

You’ll learn the critical roles that data I/O, linear algebra, statistics, data operations, learning and prediction, and Hadoop MapReduce play in the process. Throughout the book, you’ll find code examples you can use in your applications.

  • Examine methods for obtaining, cleaning, and arranging data into its purest form
  • Understand the matrix structure that your data should take
  • Learn basic concepts for testing the origin and validity of data
  • Transform your data into stable and usable numerical values
  • Understand supervised and unsupervised learning algorithms, and methods for evaluating their success
  • Get up and running with MapReduce, using customized components suitable for data science algorithms

This book helps Java software engineers move into data science and apply data science concepts in their own applications.