Audio and Speech Processing with MATLAB (Paperback)

Hill, Paul

  • Publisher: CRC
  • Publication date: 2020-09-30
  • List price: $2,640
  • VIP price: $2,508 (5% off)
  • Language: English
  • Pages: 330
  • Binding: Quality Paper - also called trade paper
  • ISBN: 0367656310
  • ISBN-13: 9780367656317
  • Related categories: Matlab
  • Other editions: Audio and Speech Processing with MATLAB
  • Ships immediately (stock: 1)

Product description

Speech and audio processing has undergone a revolution over recent decades, and progress has accelerated in the last few years, producing game-changing technologies such as truly successful speech recognition systems, a goal that had remained out of reach until very recently. This book gives the reader a comprehensive overview of contemporary speech and audio processing techniques, with an emphasis on practical implementations and illustrations using MATLAB code. Core concepts are covered first, with an introduction to the physics of audio and vibration together with their representations using complex numbers, Z transforms and frequency analysis transforms such as the FFT.

Later chapters describe the human auditory system and the fundamentals of psychoacoustics. The insights, results, and analyses given in these chapters then serve as the basis for understanding the middle section of the book, which covers wideband audio compression (MP3 audio etc.), speech recognition and speech coding.

The final chapter covers musical synthesis and its applications, describing methods such as AM, FM and ring modulation, each with MATLAB examples. It closes with an example that uses time-frequency modification to implement a so-called phase vocoder for time stretching, also in MATLAB.
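
As a rough illustration of the three modulation techniques named above, the following sketch generates a few samples of each. It is not taken from the book, which uses MATLAB; it is written in Python with only the standard library, and the function name `synthesize`, its parameters and their default values are all assumptions chosen for this example.

```python
import math


def synthesize(kind, carrier_hz=440.0, mod_hz=110.0, depth=0.5,
               duration_s=0.01, sample_rate=8000):
    """Return a list of samples for 'am', 'ring' or 'fm' modulation.

    All names and default values here are illustrative assumptions.
    """
    n = round(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        carrier = math.sin(2 * math.pi * carrier_hz * t)
        modulator = math.sin(2 * math.pi * mod_hz * t)
        if kind == "am":
            # AM: the carrier amplitude follows (1 + depth * modulator).
            sample = (1.0 + depth * modulator) * carrier
        elif kind == "ring":
            # Ring modulation: plain product of carrier and modulator,
            # which leaves only the sum and difference frequencies.
            sample = modulator * carrier
        elif kind == "fm":
            # FM written as phase modulation with a sinusoidal modulator;
            # depth acts as the modulation index.
            sample = math.sin(2 * math.pi * carrier_hz * t + depth * modulator)
        else:
            raise ValueError(f"unknown modulation kind: {kind!r}")
        samples.append(sample)
    return samples
```

Calling `synthesize("ring")` returns a signal whose only components are at the sum and difference of the carrier and modulator frequencies (550 Hz and 330 Hz with the defaults above), whereas `synthesize("am")` also keeps the carrier itself.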

Features

  • A comprehensive overview of contemporary speech and audio processing techniques, from perceptual and physical acoustic models to a thorough background in relevant digital signal processing techniques, together with an exploration of speech and audio applications.
  • A carefully paced progression in the complexity of the described methods, building in many cases from first principles.
  • Speech and wideband audio coding together with a description of associated standardised codecs (e.g. MP3, AAC and GSM).
  • Speech recognition: Feature extraction (e.g. MFCC features), Hidden Markov Models (HMMs) and deep learning techniques such as Long Short-Term Memory (LSTM) methods.
  • Book and computer-based problems at the end of each chapter.
  • Contains numerous real-world examples backed up by many MATLAB functions and code.


About the author

Dr Paul Hill received his B.Sc. degree from the Open University (1996), an M.Sc. degree from the University of Bristol, Bristol, U.K. (1998), and a Ph.D., also from the University of Bristol (2002). His research interests include image and video analysis, compression, fusion and multiscale transforms, together with audio applications such as compression, retrieval and signal separation. He is currently a senior research fellow at the Department of Electrical and Electronic Engineering at the University of Bristol. He has taught the speech and audio processing course at the university for over 8 years and has supervised numerous audio MSc projects over that time. He has published over 30 academic papers and is also an amateur musician and composer, often reflecting his passion for electronic music in his lectures and presentations.
