การสังเคราะห์พยางค์เสียงหนักและพยางค์เสียงเบาในภาษาไทย

นัฐพล พานสมบัติ, 2520-

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/1291

Title:	การสังเคราะห์พยางค์เสียงหนักและพยางค์เสียงเบาในภาษาไทย
Other Titles:	Synthesis of stressed and unstressed syllable in Thai language
Authors:	นัฐพล พานสมบัติ, 2520-
Advisors:	เอกชัย ลีลารัศมี; สุดาพร ลักษณียนาวิน
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Ekachai.L@chula.ac.th; Sudaporn.L@chula.ac.th
Subjects:	Thai language -- Syllables; Thai language -- Phonemes; Speech synthesis
Issue Date:	2545 BE (2002 CE)
Publisher:	Chulalongkorn University
Abstract:	This thesis presents a method for synthesizing stressed and unstressed syllables in Thai by modifying three acoustic features of the speech signal: duration, fundamental frequency, and amplitude. These modifications make the synthesized speech sound more natural. Duration and fundamental frequency are modified with the Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) method. Duration is modified by inserting or removing short-time signal segments so that each syllable reaches the duration determined by its stress and by the structure of its rhythmic unit. Fundamental frequency is modified by adjusting the spacing between pitch marks according to a purpose-built database of fundamental-frequency patterns for unstressed syllables, comprising 14 patterns classified by tone and syllable structure. Amplitude is modified by multiplying the speech signal by the amplitude ratio between stressed and unstressed syllables, taken from a database built for all 24 vowel units. The quality of the speech synthesized by this method was evaluated by 10 volunteers, giving a Mean Opinion Score (MOS) of 3.67 for acoustic modification at the word level and 3.92 at the sentence level.
Other Abstract:	This thesis presents a method for synthesizing stressed and unstressed syllables by modifying three acoustic characteristics of the speech signal: duration, fundamental frequency, and amplitude. The aim is to make the synthesized speech sound more natural. Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) is used to modify duration and fundamental frequency. Duration is expanded or compressed by inserting or removing short-time signal segments to reach the desired syllable duration, which depends on the structure of the syllable's rhythmic unit. The fundamental frequency (F0) of the speech is modified according to 14 F0 patterns for unstressed syllables, classified by syllable tone and syllable structure. This is done by adjusting the intervals between consecutive pitch marks. Amplitude is modified by multiplying the speech signal by the amplitude ratio between unstressed and stressed syllables, with ratios stored separately for each of 24 vowel units. The speech quality of this synthesis method was assessed by 10 volunteers. The resulting Mean Opinion Score (MOS) was 3.67 for acoustic modification at the word level and 3.92 at the sentence level.
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2545 BE (2002 CE)
Degree Name:	Master of Engineering
Degree Level:	Master's Degree
Degree Discipline:	Electrical Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/1291
ISBN:	9741797494
Type:	Thesis
Appears in Collections:	Eng - Theses
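
The abstract describes three signal modifications: TD-PSOLA changes to duration and fundamental frequency, and a multiplicative change to amplitude. The Python sketch below is one minimal way to illustrate them and is not the thesis's implementation. It assumes the pitch marks are already known, applies a single constant F0 factor instead of the 14 contour patterns, and does not renormalise gain when the pitch changes. All names (`td_psola`, `scale_amplitude`, `duration_scale`, `f0_scale`, `ratio`) are hypothetical.

```python
import math


def hann(n):
    """Periodic Hann window of length n; copies shifted by n/2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]


def td_psola(signal, marks, duration_scale=1.0, f0_scale=1.0):
    """Simplified TD-PSOLA (illustrative assumption, not the thesis code).

    signal: list of float samples.
    marks: increasing sample indices of pitch marks (assumed given).
    duration_scale: > 1 lengthens the signal, < 1 shortens it.
    f0_scale: > 1 raises F0 by placing synthesis marks closer together.
    Returns (output_samples, synthesis_marks).
    """
    if len(marks) < 2:
        raise ValueError("need at least two pitch marks")
    # Local pitch period at each analysis mark; the last one reuses its neighbour.
    periods = [b - a for a, b in zip(marks, marks[1:])]
    periods.append(periods[-1])

    out_len = int(round(len(signal) * duration_scale))
    out = [0.0] * out_len
    new_marks = []

    t = marks[0] * duration_scale      # current synthesis mark (output time)
    end = marks[-1] * duration_scale
    while t <= end:
        # Map the output time back to the input and pick the nearest analysis
        # mark. Segments are thereby repeated or skipped to change duration.
        src = t / duration_scale
        i = min(range(len(marks)), key=lambda k: abs(marks[k] - src))
        p = periods[i]
        win = hann(2 * p)
        centre = int(round(t))
        # Overlap-add a two-period windowed segment centred on the new mark.
        for k in range(2 * p):
            s = marks[i] - p + k
            d = centre - p + k
            if 0 <= s < len(signal) and 0 <= d < out_len:
                out[d] += signal[s] * win[k]
        new_marks.append(centre)
        # Spacing between synthesis marks sets the new pitch period.
        t += p / f0_scale
    return out, new_marks


def scale_amplitude(signal, ratio):
    """Multiply every sample by an amplitude ratio (e.g. unstressed/stressed)."""
    return [x * ratio for x in signal]
```

For an unstressed syllable one might call `td_psola(x, marks, duration_scale=0.6)` and then `scale_amplitude(y, 0.7)`. Both numbers are placeholders, since this record does not give the thesis's duration targets or amplitude ratios.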

Files in This Item:

File	Description	Size	Format
Nattapol.pdf		2.3 MB	Adobe PDF