解析深度學習 (2nd Edition)

魏秀參 (Xiu-Shen Wei)

  • Publisher: Publishing House of Electronics Industry (電子工業)
  • Publication date: 2025-01-01
  • List price: $774
  • Sale price: $658 (15% off)
  • Language: Simplified Chinese
  • Pages: 344
  • ISBN: 7121491664
  • ISBN-13: 9787121491665
  • Related categories: DeepLearning
  • Restocked after ordering (approx. 4–6 weeks)

Product Description

Deep learning is a family of algorithms that use artificial neural networks and related architectures to learn representations from data. It is an important branch of computer science and artificial intelligence: its representative achievements, such as convolutional neural networks and recurrent neural networks, have become mainstream workhorse technologies in the information industry, the industrial internet, and other sectors, and have been successfully applied in many real-world scenarios.

As a deep learning textbook for Chinese-speaking readers, this book is built on the principle of combining theory with practice, embodying the educational philosophy of "the unity of knowledge and action". Besides the introduction and appendices, the book comprises 15 chapters organized into four parts. Part I, "Machine Learning" (Chapter 1), introduces the basic terminology, fundamental theory, and models of machine learning. Part II, "Deep Learning Fundamentals" (Chapters 2–5), covers basic deep learning concepts, convolutional neural networks, recurrent neural networks, and Transformer networks. Part III, "Deep Learning in Practice" (Chapters 6–14), walks through the practical techniques and hard-won experience of building a deep learning model, from data preparation through network parameter initialization, the choice of network components, network configuration, model training, and the handling of imbalanced samples, to model ensembling. Part IV, "Advanced Deep Learning" (Chapter 15), takes fundamental computer vision tasks as examples to introduce advanced developments in deep learning and their applications.

This is not a programming book. Rather, through its two strands of foundational knowledge and practical techniques, it aims to help readers understand, master, and successfully build deep learning models for their own applications from a higher vantage point. The book is suitable as a textbook for undergraduate or graduate students in computer science, artificial intelligence, automation, and related majors, and as a reference for researchers and engineers interested in deep learning.

Table of Contents

Chapter 0  Introduction
0.1 Opening Remarks
0.2 Artificial Intelligence
0.3 Deep Learning
References
Part I  Machine Learning
Chapter 1  Fundamentals of Machine Learning
1.1 Basic Terminology of Machine Learning
1.2 Model Evaluation and Selection
1.2.1 Empirical Error and Overfitting
1.2.2 Common Evaluation Methods
1.2.3 Performance Metrics
1.2.4 Bias and Variance
1.3 Linear Models
1.3.1 Basic Form
1.3.2 Linear Regression
1.3.3 Linear Discriminant Analysis
1.3.4 Multi-Class Learning
1.3.5 Support Vector Machines
1.4 Basic Theory of Machine Learning
1.4.1 PAC Learning Theory
1.4.2 The No Free Lunch Theorem
1.4.3 Occam's Razor
1.4.4 Inductive Bias
1.5 Summary and Further Reading
1.6 Exercises
References
Part II  Deep Learning Fundamentals
Chapter 2  Basic Concepts of Deep Learning
2.1 Historical Development
2.2 The "End-to-End" Paradigm
2.3 Basic Structure
2.4 Feed-Forward Computation
2.5 Backward Computation
2.6 Summary and Further Reading
2.7 Exercises
References
Chapter 3  Convolutional Neural Networks
3.1 Basic Components and Operations of CNNs
3.1.1 Notation
3.1.2 Convolutional Layers
3.1.3 Pooling Layers
3.1.4 Activation Functions
3.1.5 Fully Connected Layers
3.1.6 Objective Functions
3.2 Classic CNN Architectures
3.2.1 Key Concepts in CNN Architectures
3.2.2 Case Studies of Classic Networks
3.3 CNN Compression
3.3.1 Low-Rank Approximation
3.3.2 Pruning and Sparsity Constraints
3.3.3 Parameter Quantization
3.3.4 Binary Networks
3.3.5 Knowledge Distillation
3.3.6 Compact Network Architectures
3.4 Summary and Further Reading
3.5 Exercises
References
Chapter 4  Recurrent Neural Networks
4.1 Basic Components and Operations of RNNs
4.1.1 Notation
4.1.2 Memory Modules
4.1.3 Parameter Learning
4.1.4 The Long-Range Dependency Problem
4.2 Classic RNN Architectures
4.2.1 Long Short-Term Memory Networks
4.2.2 Gated Recurrent Unit Networks
4.2.3 Stacked Recurrent Neural Networks
4.2.4 Bidirectional Recurrent Neural Networks
4.3 RNN Extensions
4.3.1 Recursive Neural Networks
4.3.2 Graph Neural Networks
4.4 RNN Training
4.5 Summary and Further Reading
4.6 Exercises
References
Chapter 5  Transformer Networks
5.1 Basic Components and Operations of Transformer Networks
5.1.1 Notation
5.1.2 Positional Encoding
5.1.3 Multi-Head Attention
5.1.4 The Encoder
5.1.5 The Decoder
5.2 Classic Transformer Architectures
5.2.1 Transformer-XL
5.2.2 Longformer
5.2.3 Reformer
5.2.4 Universal Transformer
5.3 Transformer Network Training
5.4 Summary and Further Reading
5.5 Exercises
References
Part III  Deep Learning in Practice
Chapter 6  Data Augmentation and Data Preprocessing
6.1 Simple Data Augmentation Methods
6.2 Special Data Augmentation Methods
6.2.1 Fancy PCA
6.2.2 Supervised Data Augmentation
6.2.3 The mixup Method
6.2.4 Automated Data Augmentation
6.3 Data Preprocessing for Deep Learning
6.4 Summary and Further Reading
6.5 Exercises
References
Chapter 7  Network Parameter Initialization
7.1 All-Zero Initialization
7.2 Random Initialization
7.3 Other Initialization Methods
7.4 Summary and Further Reading
7.5 Exercises
References
Chapter 8  Activation Functions
8.1 The Sigmoid Function
8.2 The tanh(x) Function
8.3 Rectified Linear Unit (ReLU)
8.4 Leaky ReLU
8.5 Parametric ReLU
8.6 Randomized ReLU
8.7 Exponential Linear Unit (ELU)
8.8 Activation Functions in Practice
8.9 Summary and Further Reading
8.10 Exercises
References
Chapter 9  Objective Functions
9.1 Objective Functions for Classification
9.1.1 Cross-Entropy Loss
9.1.2 Hinge Loss
9.1.3 Ramp Loss
9.1.4 Large-Margin Cross-Entropy Loss
9.1.5 Center Loss
9.2 Objective Functions for Regression
9.2.1 The ℓ1 Loss
9.2.2 The ℓ2 Loss
9.2.3 Tukey's Biweight Loss
9.3 Objective Functions for Other Tasks
9.4 Summary and Further Reading
9.5 Exercises
References
Chapter 10  Network Regularization
10.1 ℓ2 Regularization
10.2 ℓ1 Regularization
10.3 Max-Norm Constraints
10.4 Dropout
10.5 Using a Validation Set
10.6 Summary and Further Reading
10.7 Exercises
References
Chapter 11  Hyperparameter Setting and Network Training
11.1 Setting Network Hyperparameters
11.1.1 Input Image Size
11.1.2 Convolutional Layer Hyperparameters
11.1.3 Pooling Layer Hyperparameters
11.2 Training Techniques
11.2.1 Shuffling the Training Data
11.2.2 Setting the Learning Rate
11.2.3 Batch Normalization
11.2.4 Choosing a Network Optimization Algorithm
11.2.5 Fine-Tuning Neural Networks
11.3 Summary and Further Reading
11.4 Exercises
References
Chapter 12  Handling Imbalanced Samples
12.1 Data-Level Methods
12.1.1 Data Resampling
12.1.2 Class-Balanced Sampling
12.2 Algorithm-Level Methods
12.2.1 Cost-Sensitive Methods
12.2.2 Assigning Weights in Cost-Sensitive Methods
12.3 Summary and Further Reading
12.4 Exercises
References
Chapter 13  Model Ensemble Methods
13.1 Data-Level Ensemble Methods
13.1.1 Test-Time Data Augmentation
13.1.2 The Simple Ensemble Method
13.2 Model-Level Ensemble Methods
13.2.1 Single-Model Ensembles
13.2.2 Multi-Model Ensembles
13.3 Summary and Further Reading
13.4 Exercises
References
Chapter 14  Overview of Open-Source Deep Learning Tools
14.1 Comparison of Common Frameworks
14.2 Features of Representative Frameworks
14.2.1 Caffe
14.2.2 Jittor
14.2.3 Keras
14.2.4 MatConvNet
14.2.5 MindSpore
14.2.6 MXNet
14.2.7 PyTorch
14.2.8 TensorFlow
14.2.9 Theano
14.2.10 Torch
References
Part IV  Advanced Deep Learning
Chapter 15  Advanced Computer Vision and Applications
15.1 Image Recognition
15.1.1 Datasets and Evaluation Metrics
15.1.2 Inception Models
15.1.3 ResNet and Its Variants
15.1.4 EfficientNet Models
15.1.5 Vision Transformer Models
15.2 Object Detection
15.2.1 Datasets and Evaluation Metrics
15.2.2 Two-Stage Object Detection Methods
15.2.3 Single-Stage Object Detection Methods
15.3 Image Segmentation
15.3.1 Datasets and Evaluation Metrics
15.3.2 Semantic Segmentation
15.3.3 Instance Segmentation
15.4 Video Understanding
15.4.1 Datasets and Evaluation Metrics
15.4.2 Action Recognition with 2D Convolutions
15.4.3 Action Recognition with 3D Convolutions
15.4.4 Temporal Action Localization
15.5 Summary and Further Reading
15.6 Exercises
References
Appendices
Appendix A  Vectors, Matrices, and Their Basic Operations
A.1 Vectors and Their Basic Operations
A.1.1 Vectors
A.1.2 Vector Norms
A.1.3 Vector Operations
A.2 Matrices and Their Basic Operations
A.2.1 Matrices
A.2.2 Matrix Norms
A.2.3 Matrix Operations
Appendix B  Calculus
B.1 Differentiation
B.1.1 Derivatives
B.1.2 Differentiable Functions
B.1.3 Taylor's Formula
B.2 Integration
B.2.1 Indefinite Integrals
B.2.2 Definite Integrals
B.2.3 Common Integral Formulas
B.3 Matrix Calculus
B.3.1 Matrix Differentiation
B.3.2 Matrix Integration
B.4 Derivatives of Common Functions
B.5 The Chain Rule
B.6 Stochastic Gradient Descent
Appendix C  Linear Algebra
C.1 Linear Algebra and Deep Learning
C.2 Matrix Types
C.3 Eigenvalues and Eigenvectors
C.4 Matrix Decompositions
C.4.1 LU Decomposition
C.4.2 QR Decomposition
C.4.3 Singular Value Decomposition
Appendix D  Probability Theory
D.1 Probability Theory and Deep Learning
D.2 Sample Space
D.3 Events and Probability
D.4 Conditional Probability Distributions
D.5 Bayes' Theorem
D.6 Random Variables, Expectation, and Variance
D.6.1 Random Variables
D.6.2 Expectation
D.6.3 Variance
D.7 Common Continuous Probability Distributions