Improvements in Speech Synthesis

Eric Keller, G. Bailly

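  • Publisher: Wiley
  • Publication date: 2001-11-28
  • List price: $1,300
  • Member price: $1,274 (98% of list price)
  • Language: English
  • Pages: 408
  • Binding: Hardcover
  • ISBN: 0471499854
  • ISBN-13: 9780471499855
  • Ordered from the supplier once the order is placed (approx. 5-7 days)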

Product Description

Naturalness in synthetic speech is one of the most intractable problems in information technology today. Although speech synthesis systems have improved considerably over the last 20 years, they rarely sound entirely like human speakers.

Why is this so, and what can be done about it?

  • Prosodic processing must be rendered more varied and more appropriate to the speech situation

  • Timing, melodic control and the relationships between the various prosodic parameters need increased attention

  • Signal processing systems must be developed and perfected that are capable of generating more than just one voice from a database

  • A better understanding must be achieved of what distinguishes one voice from another, and of how speech styles differ between simply reading aloud numbers and sentences and their use in interactive speech

  • New evaluation methodologies should be developed to provide objective and subjective measurements of the intelligibility of the synthetic speech and the cognitive load imposed upon the listener by impoverished stimuli

  • Adequate text markup systems must be proposed and tested with multiple languages in real-world situations

  • Further research is required to integrate speech synthesis systems into larger natural-language processing systems

Improvements in Speech Synthesis presents the latest research in the above areas. Contributors include speech synthesis specialists from 16 countries, with experience in the development of systems for 12 European languages. This volume emerges from a four-year European COST project focussed on "The Naturalness of Synthetic Speech", and will be a valuable text for everyone involved in speech synthesis.

Table of Contents

      List of Contributors.

      Preface.

      PART I: ISSUES IN SIGNAL GENERATION.

      Towards Greater Naturalness: Future Directions of Research in Speech Synthesis (Keller, E.).

      Towards More Versatile Signal Generation Systems (Bailly, G.).

      A Parametric Harmonic + Noise Model (Bailly, G.).

      The COST 258 Signal Generation Test Array (Bailly, G.).

      Concatenative Text-to-Speech Synthesis Based on Sinusoidal Modelling (Banga, E.R. et al).

      Shape Invariant Pitch and Time-Scale Modification of Speech Based on a Harmonic Model (O'Brien, D. & Monaghan, A.).

      Concatenative Speech Synthesis Using SRELP (Rank, E.).

      PART II: ISSUES IN PROSODY.

      Prosody in Synthetic Speech: Problems, Solutions and Challenges (Monaghan, A.).

      State-of-the-Art Summary of European Synthetic Prosody R&D (Monaghan, A.).

      Modelling F0 Contour in Various Romance Languages: Implementation in Some TTS Systems (Martin, P.).

      Acoustic Characterisation of the Tonic Syllable in Portuguese (Teixeira, J.P. and Freitas, D.).

      Prosodic Parameters of Synthetic Czech: Developing Rules for Duration and Intensity (Dohalska, M. et al).

      MFGI, a Linguistically Motivated Quantitative Model of German Prosody (Mixdorff, H.).

      Improvements in Modelling the F0 Contour for Different Types of Intonation Units in Slovene (Dobnikar, A.).

      Representing Speech Rhythm (Keller, B.Z. and Keller, E.).

      Phonetic and Timing Considerations in a Swiss High German TTS System (Siebenhaar, B. et al).

      Corpus-based Development of Prosodic Models Across Six Languages (Fackrell, J. et al).

      Vowel Reduction in German Read Speech (Widera, C.).

      PART III: ISSUES IN STYLES OF SPEECH.

      Variability and Speaking Styles in Speech Synthesis (Terken, J.).

      An Auditory Analysis of the Prosody of Fast and Slow Speech Styles in English, Dutch and German (Monaghan, A.).

      Automatic Prosody Modelling of Galician and its Application to Spanish (Gonzalo, E.L. et al).

      Reduction and Assimilatory Processes in Conversational French Speech: Implications for Speech Synthesis (Duez, D.).

      Acoustic Patterns of Emotions (Pollermann, B.Z. and Archinard, M.).

      The Role of Pitch and Tempo in Spanish Emotional Speech: Towards Concatenative Synthesis (Montero, J.M. et al).

      Voice Quality and the Synthesis of Affect (Chasaide, A.N. and Gobl, C.).

      Prosodic Parameters of a 'Fun' Speaking Style (Gustafson, K. and House, D.).

      Dynamics of the Glottal Source Signal: Implications for Naturalness in Speech Synthesis (Gobl, C. and Chasaide, A.N.).

      A Nonlinear Rhythmic Component in Various Styles of Speech (Keller, B.Z. and Keller, E.).

      PART IV: ISSUES IN SEGMENTATION AND MARK-UP.

      Issues in Segmentation and Mark-up (Huckvale, M.).

      The Use and Potential of Extensible Mark-up (XML) in Speech Generation (Huckvale, M.).

      Mark-Up for Speech Synthesis: A Review and Some Suggestions (Monaghan, A.).

      Automatic Analysis of Prosody for Multi-lingual Speech Corpora (Hirst, D.).

      Automatic Speech Segmentation Based on Alignment with a Text-to-Speech System (Horak, P.).

      Using the COST 249 Reference Speech Recogniser for Automatic Speech Segmentation (Warakagoda, N.D. and Natvig, J.E.).

      PART V: FUTURE CHALLENGES.

      Future Challenges (Keller, E.).

      Towards Naturalness, or the Challenge of Subjectiveness (Caelen-Haumont, G.).

      Synthesis within Multi-Modal Systems (Breen,
