Acoustic parameters for manner of articulation classification in Thai continuous speech

วิทยา โรจน์กิตติเจริญ

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/31730

Title:	พารามิเตอร์ทางเสียงสำหรับการจำแนกลักษณะการเปล่งเสียงในเสียงพูดต่อเนื่องภาษาไทย
Other Titles:	Acoustic parameters for manner of articulation classification in Thai continuous speech
Authors:	วิทยา โรจน์กิตติเจริญ
Advisors:	อติวงศ์ สุชาโต; โปรดปราน บุณยพุกกณะ
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Atiwong.S@Chula.ac.th; Proadpran.P@Chula.ac.th
Subjects:	Thai language -- Phonetics; Speech; Automatic speech recognition
Issue Date:	2011 (B.E. 2554)
Publisher:	Chulalongkorn University
Abstract:	Developing alternative speech recognition systems, such as landmark-based speech recognition, requires locating the landmarks of the sounds of interest, for example the positions of consonants or vowels, so that they can serve as input to the recognizer. This thesis therefore focuses on classifying manner of articulation in Thai continuous speech, so that the results can be used to develop a landmark-based speech recognition system. The thesis adapts a set of acoustic parameters to suit Thai by adding 1) the spectral center of gravity, 2) the short-time zero-crossing rate, and 3) the ratio of the energy in the [0-400] Hz band to the energy in the [400-6000] Hz band. Phonetic feature classification experiments show that the classification error decreases by 28.09%, 11.0%, and 2.41% for the features [continuant], [syllabic], and [silence], respectively, compared with the acoustic parameter set used to classify phonetic features in English. When the speech is segmented to locate consonants and vowels, the segmentation accuracy is 80.46%, with a 23.46% reduction in segmentation error relative to a baseline system based on hidden Markov model speech recognition. In the final experiment, which compares syllable-level recognition in the initial consonant-vowel-final consonant form, the proposed system and the baseline system achieve the same level of accuracy.
Other Abstract:	A landmark-based speech recognition system must locate speech landmarks, such as consonant or vowel landmarks, and use them as input to the recognizer. This thesis focuses on finding the broad manner classes of Thai speech to support the development of a landmark-based speech recognition system, and it aims to improve the acoustic parameters for Thai automatic speech recognition. We propose acoustic parameters that capture the characteristics of the broad manner classes of Thai speech: 1) spectral center of gravity, 2) short-time zero-crossing rate, and 3) the energy ratio E[0-400] to E[400-6000]. The results show error reductions of 28.09%, 11.0%, and 2.41% for the continuant, syllabic, and silence features, respectively, compared with the acoustic parameters used for English. The speech segmentation task reached an accuracy of 80.46%, a 23.46% error reduction relative to the baseline HMM-MFCC broad class segmentation. Word classification in the CVC context performed similarly to the baseline HMM-MFCC system in word recognition tasks.
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2011
Degree Name:	Master of Engineering
Degree Level:	Master's Degree
Degree Discipline:	Computer Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/31730
URI:	http://doi.org/10.14457/CU.the.2011.291
DOI:	10.14457/CU.the.2011.291
Type:	Thesis
Appears in Collections:	Eng - Theses
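
The three acoustic parameters named in the abstract can be illustrated with a minimal Python sketch. This record does not give the thesis's frame length, windowing, or exact definitions, so everything below is an assumption made for illustration: the function names are hypothetical, the zero-crossing rate is taken as the fraction of adjacent sample pairs that change sign, the center of gravity is the power-weighted mean frequency, and the band edges are read as [0, 400) Hz and [400, 6000) Hz.

```python
import cmath
import math


def power_spectrum(frame, sample_rate):
    """Return (frequencies in Hz, power) for the non-negative DFT bins of a frame."""
    n = len(frame)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # Naive DFT of bin k: adequate for one short frame, too slow for real use.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def spectral_center_of_gravity(freqs, power):
    """Power-weighted mean frequency in Hz (assumed definition)."""
    total = sum(power)
    return sum(f * p for f, p in zip(freqs, power)) / total if total > 0 else 0.0


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (assumed definition)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def band_energy_ratio(freqs, power, low=(0.0, 400.0), high=(400.0, 6000.0)):
    """Energy in the low band divided by energy in the high band."""
    e_low = sum(p for f, p in zip(freqs, power) if low[0] <= f < low[1])
    e_high = sum(p for f, p in zip(freqs, power) if high[0] <= f < high[1])
    return e_low / e_high if e_high > 0 else math.inf


# Synthetic 20 ms frame at 16 kHz (both values are assumptions):
# a 200 Hz tone plus a 3000 Hz tone at half the amplitude.
SAMPLE_RATE = 16000
frame = [
    math.sin(2 * math.pi * 200 * i / SAMPLE_RATE + 0.1)
    + 0.5 * math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE + 0.1)
    for i in range(320)
]

freqs, power = power_spectrum(frame, SAMPLE_RATE)
cog = spectral_center_of_gravity(freqs, power)
zcr = zero_crossing_rate(frame)
ratio = band_energy_ratio(freqs, power)
```

For this synthetic frame both tones fall exactly on DFT bins, so the center of gravity comes out at about 760 Hz and the band energy ratio at about 4. Real speech frames would normally be windowed and processed with an FFT library, which this sketch leaves out to stay dependency-free.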

Files in This Item:

File	Description	Size	Format
wittaya_ro.pdf		1.79 MB	Adobe PDF
