Thai continuous speech recognition using neural networks

ประเสริฐศักดิ์ ผุงประเสริฐยิ่ง

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/19235

Title:	การรู้จำเสียงพูดต่อเนื่องภาษาไทยโดยใช้นิวรอลเน็ตเวิร์ก
Other Titles:	Thai continuous speech recognition using neural networks
Authors:	ประเสริฐศักดิ์ ผุงประเสริฐยิ่ง
Advisors:	บุญเสริม กิจศิริกุล
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	boonserm@cp.eng.chula.ac.th, Boonserm.K@chula.ac.th
Subjects:	Automatic speech recognition; Neural networks (Computer science)
Issue Date:	2007 (B.E. 2550)
Publisher:	Chulalongkorn University
Abstract:	This research develops an automatic continuous speech recognition system for Thai that uses neural networks to recognize phonemes at the frame level. The frame-level recognition results are then combined with a language model and a search process to produce a sequence of words as output. System performance was analyzed on the Thai First Names Speech Corpus and the Thai Animal Speech Corpus. The experiments varied three system parameters: the phoneme set, the PLP (perceptual linear prediction) order, and the number of frames. Recognition accuracy is reported at the frame level and at the word level, for both the training set and the test set. On the test set of the Thai First Names Speech Corpus, the system reaches a maximum accuracy of about 70% at the frame level and about 90% at the word level. On the test set of the Thai Animal Speech Corpus, it reaches a maximum accuracy of about 60% at the frame level and about 40% at the word level.
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2007 (B.E. 2550)
Degree Name:	Master of Engineering
Degree Level:	Master's Degree
Degree Discipline:	Computer Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/19235
URI:	http://doi.org/10.14457/CU.the.2007.1321
metadata.dc.identifier.DOI:	10.14457/CU.the.2007.1321
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
Prasertsak_pu.pdf		1.46 MB	Adobe PDF
