A study of prosodic features for Indonesian speech recognition

Nazrul Effendy

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/13649

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Somchai Jitapunkul	-
dc.contributor.author	Nazrul Effendy	-
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	-
dc.date.accessioned	2010-10-14T08:18:29Z	-
dc.date.available	2010-10-14T08:18:29Z	-
dc.date.issued	2006	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/13649	-
dc.description	Thesis (D.Eng.)--Chulalongkorn University, 2006	en
dc.description.abstract	Utterance-type information has been used in spoken dialogue systems, speech recognition systems, and machine translation. In a typical spoken dialogue system, a user can ask a question or give information to the system. The system, in turn, should be able to recognize the user's intention in order to respond correctly. This dissertation proposes an automatic utterance-type recognizer that distinguishes declarative questions from statements in Indonesian speech. Because utterances of these two types contain the same words in the same order and differ only in intonation, classifying them requires not only a word recognizer but also an intonation recognizer. The utterance-type recognizer is first designed on the basis of the Fujisaki model, using a combination of Fujisaki model parameters as the features for recognizing the two utterance types. The best performance of the Fujisaki-model-based recognizer is achieved when a scaled value of F_b (F_b/100), the amplitude of the last accent command, and the magnitude of the last phrase command are used as the inputs to the neural networks. However, the Fujisaki parameter extractor is too complicated to implement in an automatic recognition system. The utterance-type recognizer is therefore developed using the polynomial coefficients of the pitch contour of the sentence's final word. The automatic utterance-type recognizer using polynomial expansion consists of a pitch contour extractor, a normalizer, a feature extractor, a classifier, and an automatic utterance segmentation module. The pitch contour of each utterance type is analyzed to investigate the final word of the two utterance types. To create the automatic utterance segmentation module, an Indonesian acoustic model is designed. The evaluation confirms that the method using the final word and polynomial expansion is effective in distinguishing declarative questions from statements in Indonesian speech.	en
dc.description.abstractalternative	ข้อมูลชนิด Utterance ถูกใช้ใน spoken dialogue system ระบบการรู้จำเสียงพูด และ translation machine โดยทั่วไปใน spoken dialogue system ผู้ใช้สามารถถามคำถามหรือให้ข้อมูลกับเครื่องได้ spoken dialogue system จึงควรจะสามารถรู้จำผู้ใช้ได้เพื่อให้สามารถตอบสนองได้อย่างถูกต้อง วิทยานิพนธ์ฉบับนี้นำเสนอ the automatic utterance type recognizer เพื่อให้สามารถแยกแยะ declarative questions ออกจาก statements ในภาษาอินโดนีเซียได้ เนื่องจาก Utterance ในเสียงพูดทั้งสองชนิดมีลักษณะด้านคำและลำดับที่เหมือนกัน และแตกต่างกันในเฉพาะด้าน intonations การแยกแยะจึงต้องใช้ทั้งตัวรู้จักคำ และตัวรู้จำ intonations ตัวรู้จำชนิด Utterance ถูกออกแบบบนพื้นฐานของแบบจำลองฟูจิซากิ ตัวรู้จำดังกล่าวจะใช้ผลรวมของค่าพารามิเตอร์จากแบบจำลองฟูจิซากิในการรู้จำ Utterance ในเสียงพูดทั้งสองชนิด ประสิทธิภาพสูงสุดของแบบจำลองฟูจิซากิเมื่อนำมาใช้ในการรู้จำ Utterance ได้จากผลรวมของค่า fraction value เท่ากับ F[subscript b]/100 ค่าแอมพิจูดของ accent command และค่าขนาดของ last phrase command เป็นสัญญาณข้าวของระบบ neural อย่างไรก็ตามการดึงค่าพารามิเตอร์มาใช้งานของแบบจำลองฟูจิซากิ มีความซับซ้อนมากเกินกว่าจะสามารถนำมาใช้ได้ในระบบการรู้จำเสียงพูดแบบอัตโนมัติ ดังนั้นตัวรู้จำชนิด Utterance จึงถูกพัฒนาโโยใช้ค่าสัมประสิทธิ์ polynomial ของคอนทัวร์ความทุ้มแหลมของเสียง (Pitch contours) ของคำสุดท้ายแต่ละประโยค ตัวรู้จำชนิด Utterance แบบอัตโนมัติซึ่งใช้การแผ่ขยาย polynomial ประกอบด้วยการดึงคอนทัวร์ความทุ้มแหลมของเสียง และมอดูลการแบ่ง Utterance แบบอัตโนมัติอนทัวร์ความทุ้มแหลมของเสียงของ Utterance ในเสียงพูดแต่ละชนิดจะถูกวิเคราะห์เพื่อศึกษาถึงคำสุดท้ายของแต่ละชนิด Utterance นอกจากนี้โมเดลเสียงอินโดนีเซียได้ถูกออกแบบขึ้นเพื่อให้สามารถสร้างมอดูลการแบ่ง Utterance แบบอัตโนมัติได้ ผลการประเมินระบบแสดงว่าการใช้ข้อมูลคำสุดท้าย และการแผ่ขยาย polynomial สามารถแยกแยะ declarative questions ออกจาก statements ในภาษาอินโดนีเซียได้อย่างมีประสิทธิภาพ	en
dc.format.extent	1944990 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	en	es
dc.publisher	Chulalongkorn University	en
dc.relation.uri	http://doi.org/10.14457/CU.the.2006.1769	-
dc.rights	Chulalongkorn University	en
dc.subject	Automatic speech recognition	en
dc.subject	Indonesian language	en
dc.title	A study of prosodic features for Indonesian speech recognition	en
dc.title.alternative	การศึกษาคุณลักษณะสัทสัมพันธ์สำหรับการรู้จำเสียงพูดอินโดนีเซีย	en
dc.type	Thesis	es
dc.degree.name	Doctor of Engineering	es
dc.degree.level	Doctoral Degree	es
dc.degree.discipline	Electrical Engineering	es
dc.degree.grantor	Chulalongkorn University	en
dc.email.advisor	Somchai.J@chula.ac.th	-
dc.identifier.DOI	10.14457/CU.the.2006.1769	-
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
Nazrul_Ef.pdf		1.9 MB	Adobe PDF
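
Illustrative note on the abstract: the polynomial-expansion step it describes (normalize the pitch contour of the final word, then use polynomial coefficients as features) might look roughly like the sketch below. The record does not give the polynomial order, the normalization scheme, or the voicing convention, so the cubic order, the log-scale mean removal, the [-1, 1] time axis, the use of 0 for unvoiced frames, and the function name final_word_poly_features are all assumptions made for illustration, not the thesis's actual method.

```python
import numpy as np


def final_word_poly_features(f0_hz, order=3):
    """Fit a polynomial to the voiced part of a final-word pitch contour.

    f0_hz: per-frame F0 values in Hz; 0 marks an unvoiced frame
           (an assumed convention, not taken from the thesis).
    order: polynomial order (assumed; the abstract does not state it).
    Returns order + 1 coefficients, highest power first.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0 > 0
    if voiced.sum() <= order:
        raise ValueError("not enough voiced frames for the requested order")

    # Rescale time to [-1, 1] so the coefficients do not depend on word duration.
    t = np.linspace(-1.0, 1.0, f0.size)[voiced]

    # Log scale plus mean removal: one assumed way to reduce speaker dependence.
    log_f0 = np.log(f0[voiced])
    log_f0 = log_f0 - log_f0.mean()

    return np.polyfit(t, log_f0, order)


if __name__ == "__main__":
    # Synthetic contour for demonstration only; not data from the thesis.
    rising = 180.0 + 60.0 * np.linspace(0.0, 1.0, 40)
    print(final_word_poly_features(rising))
```

The resulting coefficient vector would then be passed to a classifier; the abstract mentions neural networks for the Fujisaki-based recognizer but does not specify the classifier used with the polynomial features.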
