An acoustic study of syllable rhymes : a basis for Thai continuous speech recognition system

Ekkarit Maneenoi

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/5947

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Somchai Jitapunkul	-
dc.contributor.advisor	Sudaporn Luksaneeyanawin	-
dc.contributor.advisor	Chularat Tanprasert	-
dc.contributor.author	Ekkarit Maneenoi	-
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	-
dc.date.accessioned	2008-02-22T04:20:44Z	-
dc.date.available	2008-02-22T04:20:44Z	-
dc.date.issued	2003	-
dc.identifier.isbn	9741741715	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/5947	-
dc.description	Thesis (D.Eng.)--Chulalongkorn University, 2003	en
dc.description.abstract	The objective of this dissertation is to develop a new speech unit for acoustic modeling of the Thai language. Thai syllables were studied in terms of both their acoustic and phonological properties. From the acoustic point of view, within the syllable structure the final consonant is strongly influenced by the vowel duration. This relationship occurs only between the vowel and the final consonant. In contrast, the initial consonant is not affected by the duration of the vowel. Hence, the vowel and the final consonant are tightly tied, while the initial consonant is only loosely tied to the vowel in the syllable. From a phonological point of view, a syllable is composed of a pair of an onset and a rhyme unit. The onset consists of an initial consonant and its transition towards the following vowel. The rhyme, in turn, is composed of a vowel, a final consonant, and a tone. The onset-rhyme unit not only includes its contextual information but also embeds language modeling at the syllable level. Consequently, the decomposition of the syllable into an onset and a rhyme is appropriate for the Thai language. The whole set of Thai syllables can be recognized by identifying onsets and rhymes. The objective of this research is to compare the efficiency of the speech units; therefore, a tone recognition system is not implemented. To evaluate the effectiveness of the proposed acoustic model, various conventional speech units used in speech recognition systems have been investigated. Several experiments have been carried out to find the speech unit that yields an accurate acoustic model and a higher recognition rate. Recognition rates under the different acoustic models are given and compared. The speech corpus used for training in these experiments was recorded from 9 male and 11 female speakers. This group of speakers also produced the speaker-dependent (SD) test set.
In addition, the speaker-independent (SI) test set was produced by another 5 male and 5 female speakers. This speech corpus was designed to cover all onset-rhyme units in the Thai language. Experimental results show that the onset-rhyme model outperforms the other speech units. The onset-rhyme model improves on the accuracy of the baseline monophone model, the inter-syllable triphone model, and the context-dependent Initial-Final model by nearly 26.2%, 6.4%, and 4.2%, respectively, for the speaker-dependent systems using only an acoustic model, and by 29.7%, 6.0%, and 4.2% for the speaker-dependent systems using both acoustic and language models. Using the language model, the onset accuracy increases by around 16-21% for both SD and SI systems. In addition, the accuracy of the rhyme improves substantially, by nearly 45-47%, for the SD and SI systems when the language model is applied. The results show that the onset-rhyme models attain a high recognition rate. Moreover, they are also more efficient in terms of system complexity.	en
dc.description.abstractalternative	This dissertation aims to develop an acoustic speech unit for modeling Thai syllable rhymes. The research studies the properties of Thai syllables from both the acoustic and the phonological perspective. Acoustically, within the Thai syllable structure the vowel strongly affects the duration of the final consonant. This relationship occurs only between the vowel and the final consonant. The duration of the initial consonant, in contrast, is not affected by the vowel. From these properties it can be concluded that the vowel and the final consonant are closely related. Phonologically, a syllable consists of a pair of an onset and a rhyme. The onset consists of the initial consonant and the transition from the consonant to the vowel. The rhyme consists of the vowel, the final consonant, and the tone. Besides carrying the contextual information of the vowel, the onset and rhyme units also embed language modeling at the syllable level. Decomposing the syllable into two parts, the onset and the rhyme, is therefore appropriate for the Thai language. Because this research aims to compare the performance of different speech-unit models, it does not develop a tone recognition system. Various speech units used in speech recognition systems are evaluated against the proposed unit. Many experiments were carried out to find the speech unit that models the acoustic properties appropriately and gives the best recognition results; the recognition results for the different speech units are presented and compared. The training speech corpus for this research was recorded from 9 male and 11 female speakers. The same group of speakers also recorded the speaker-dependent test set. The speaker-independent test set was recorded from a separate group of 5 male and 5 female speakers. The corpus was designed to cover all onset and rhyme units in the Thai language. The experimental results show that the onset-rhyme model performs better than the other speech units.
The recognition rate of the onset-rhyme model exceeds that of the baseline monophone, inter-syllable triphone, and context-dependent Initial-Final models by 26.2%, 6.4%, and 4.2%, respectively, for the speaker-dependent system using acoustic modeling and language modeling. Using the language model raises the onset recognition rate by about 16-21% for both the speaker-dependent and speaker-independent systems. In addition, the rhyme recognition rate improves substantially, by about 45-47%, for both the speaker-dependent and speaker-independent systems when the language model is applied. The results show that the onset-rhyme models achieve a very high recognition rate. Moreover, these speech-unit models are also efficient in terms of complexity.	en
dc.format.extent	6078187 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	en	es
dc.publisher	Chulalongkorn University	en
dc.rights	Chulalongkorn University	en
dc.subject	Automatic speech recognition	en
dc.subject	Thai language -- Phonetics	en
dc.subject	Hidden Markov Model	en
dc.title	An acoustic study of syllable rhymes : a basis for Thai continuous speech recognition system	en
dc.title.alternative	การศึกษาหน่วยตามของพยางค์เชิงกลสัทศาสตร์ : พื้นฐานสำหรับระบบการรู้จำเสียงพูดต่อเนื่องภาษาไทย	en
dc.type	Thesis	es
dc.degree.name	Doctor of Philosophy	es
dc.degree.level	Doctoral Degree	es
dc.degree.discipline	Electrical Engineering	es
dc.degree.grantor	Chulalongkorn University	en
dc.email.advisor	Somchai.J@chula.ac.th	-
dc.email.advisor	Sudaporn.L@chula.ac.th	-
dc.email.advisor	Chularat.T@chula.ac.th	-
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
EkkaritM.pdf		5.94 MB	Adobe PDF
