Thai phoneme classification using segmental representation

หนึ่งฤทัย เอกชัยวรสิน

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/46688

Title:	การจำแนกหน่วยเสียงภาษาไทยโดยใช้การแทนเสียงแบบทั้งเซกเมนต์
Other Titles:	Thai phoneme classification using segmental representation
Authors:	หนึ่งฤทัย เอกชัยวรสิน
Advisors:	อติวงศ์ สุชาโต โปรดปราน บุณยพุกกณะ
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Atiwong.S@Chula.ac.th Proadpran.P@Chula.ac.th
Subjects:	Automatic speech recognition; Thai language -- Phonemics
Issue Date:	2006 (B.E. 2549)
Publisher:	Chulalongkorn University
Abstract:	Frame-based feature extraction is currently regarded as a highly popular method. However, this approach still relies on some assumptions that do not suit the nature of speech, and it has certain limitations in capturing the properties of a speech signal. These problems can be addressed with a segment-based approach, which divides the speech signal into phoneme segments instead of dividing it by time frames. This thesis therefore proposes a segment-based feature extraction method that computes 12 Mel-frequency cepstral coefficients plus 1 energy value from each of the front, middle, and rear parts of a segment, concatenated into 39 dimensions, and combines them with the segment duration for a total of 40 feature dimensions. The representational capability of the features was tested by classifying 52 Thai phonemes using linear discriminant analysis. With equal prior probability for every phoneme, the classification accuracy was 61.41 percent; with the prior probability of each phoneme set according to the number of phonemes used in training, the accuracy was 66.14 percent. Compared with frame-based phoneme classification, which has an average accuracy of 56.95 percent, the accuracy is higher by 9.19 percentage points. An analysis of how much the proposed features contribute to phoneme classification found that the middle part accounts for nearly half of the total contribution. This result supports the idea behind dividing a segment into 3 parts, namely that the middle part of the segment represents the phoneme best.
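The 40-dimensional layout described in the abstract (3 regions x 13 values, plus duration) can be sketched as follows. This is only an illustration, not the thesis's implementation: the function name `segment_features`, the split into three roughly equal regions, and the averaging of frames within each region are all assumptions, since the abstract does not say how the region boundaries or per-region values are computed.

```python
import numpy as np

def segment_features(frames, duration):
    """Build a 40-dimensional vector for one phoneme segment.

    frames:   (n_frames, 13) array, 12 MFCCs + 1 energy value per frame.
    duration: segment length as a single number (unit is up to the caller).

    Assumption: the segment is cut into three roughly equal regions
    (front, middle, rear) and each region is summarized by its mean.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[1] != 13:
        raise ValueError("frames must have shape (n_frames, 13)")
    if frames.shape[0] < 3:
        raise ValueError("need at least 3 frames to form three regions")

    # Split along the time axis into front, middle, and rear regions.
    regions = np.array_split(frames, 3, axis=0)

    # One 13-dimensional summary per region, 39 values in total.
    region_means = [region.mean(axis=0) for region in regions]

    # Append the segment duration as the 40th dimension.
    return np.concatenate(region_means + [np.array([float(duration)])])
```

For a segment with 9 frames, `segment_features(frames, 0.12)` returns a vector of shape `(40,)` whose last entry is the duration.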
Other Abstract:	Nowadays, spectral feature extraction at a fixed frame rate is a highly popular technique for representing speech signals. However, some assumptions used by this technique are not suitable for natural speech. The technique also has various limitations in acquiring some types of acoustic properties. To avoid these limitations, a segmental representation method separates an acoustic speech signal into small segments according to the underlying phonemes before performing the feature extraction. In this thesis, an approach for extracting feature vectors using segmental representation is proposed. With this approach, a speech segment is represented by a 40-dimensional feature vector consisting of 12 Mel-frequency cepstral coefficients and an energy feature from each of three regions (the frontal region, the middle region, and the rear region of the segment), together with the segment duration. In the experiments, where 52 Thai phonemes are classified using Linear Discriminant Analysis, the classification accuracy is 66.14% when prior probabilities are used, while it is 61.41% without prior probabilities. The best accuracy obtained using the segment-based approach is 9.19 percentage points higher than that of a baseline frame-based approach, which is 56.95%. In addition, the contribution analysis shows that features extracted from the middle region contribute the most to the classification.
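The effect of the prior probabilities on Linear Discriminant Analysis can be illustrated with a small NumPy sketch. It uses the textbook LDA discriminant score (shared covariance, one mean per class, plus the log prior); the function names `fit_lda` and `predict_lda` and the toy data are hypothetical and are not taken from the thesis.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate class means, pooled covariance, and empirical class priors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Pooled within-class covariance shared by all classes.
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (len(X) - len(classes))
    counts = np.array([(y == c).sum() for c in classes])
    return classes, means, cov, counts / counts.sum()

def predict_lda(x, classes, means, cov, priors):
    """Pick the class with the highest linear discriminant score."""
    inv_mu = np.linalg.solve(cov, means.T)  # columns are inv(cov) @ mean_k
    scores = (x @ inv_mu
              - 0.5 * np.sum(means.T * inv_mu, axis=0)
              + np.log(priors))
    return classes[np.argmax(scores, axis=-1)]

# Toy data (assumption): class 0 has twice as many training samples as class 1.
X = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]] * 2
             + [[1, 0], [3, 0], [2, -1], [2, 1]], dtype=float)
y = np.array([0] * 8 + [1] * 4)

classes, means, cov, empirical = fit_lda(X, y)
uniform = np.full(len(classes), 1.0 / len(classes))

x = np.array([1.1, 0.0])  # slightly closer to the class-1 mean
label_uniform = predict_lda(x, classes, means, cov, uniform)
label_empirical = predict_lda(x, classes, means, cov, empirical)
```

With equal priors the point is assigned to the nearer class mean; with priors proportional to the training counts, the decision boundary shifts toward the rarer class, so the same point is assigned to the more frequent class.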
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2006
Degree Name:	Master of Engineering
Degree Level:	Master's degree
Degree Discipline:	Computer Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/46688
URI:	http://doi.org/10.14457/CU.the.2006.1476
metadata.dc.identifier.DOI:	10.14457/CU.the.2006.1476
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
Nuengruethai.pdf		1.65 MB	Adobe PDF
