Real-Time Automatic Speech-Text Alignment

ณัฏฐ์ เลิศวงศ์คณากูล

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/44227

Title:	การประสานเวลาอัตโนมัติแบบทันทีระหว่างเสียงและข้อความ
Other Titles:	Real-Time Automatic Speech-Text Alignment
Authors:	ณัฏฐ์ เลิศวงศ์คณากูล
Advisors:	โปรดปราน บุณยพุกกณะ; อติวงศ์ สุชาโต
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	proadpran.p@chula.ac.th; atiwong.s@chula.ac.th
Subjects:	Transcription; Speech-to-text systems; Speech processing systems; Automatic speech recognition; Computational linguistics
Issue Date:	2012 (B.E. 2555)
Publisher:	Chulalongkorn University
Abstract:	Automatic speech-text alignment presents the same content through two different media, here audio and text. Most applications align at the sentence level and use the complete audio and text to do so. Some applications, however, such as audiobook production, have the full text available and need to align it as soon as the audio enters the system. Thai makes this challenging because its sentence and word boundaries are not clearly marked. This thesis therefore proposes an algorithm for real-time automatic speech-text alignment at the syllable level. The proposed algorithm is based on syllable detection and on detecting mismatches in the transcription. The experiments examined a range of features and tuned the parameters in detail. The proposed algorithm was compared with two baseline systems and outperformed them by 75% and 41%, respectively. In terms of processing time, it runs in real time.
Other Abstract:	Most research on audio-text synchronization has focused on the utterance level. However, generating audiobooks from live speech in an unstructured language such as Thai requires a finer level of synchronization. We propose an algorithm that synchronizes live speech with its corresponding transcription in real time at the syllable level. The proposed algorithm employs syllable detection and transcription-error detection. The features and parameters were studied empirically. The results were compared with two baselines, and the proposed algorithm outperformed them by 75% and 41%, respectively. In terms of processing time, the proposed algorithm produced results in real time.
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2012
Degree Name:	Master of Engineering
Degree Level:	Master's Degree
Degree Discipline:	Computer Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/44227
URI:	http://doi.org/10.14457/CU.the.2012.436
metadata.dc.identifier.DOI:	10.14457/CU.the.2012.436
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
Nat_le.pdf		2.21 MB	Adobe PDF
