Direct recognition of Thai speech from G.729 code

สิริ วงศ์วรชาติกาล

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/72271

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	สุวิทย์ นาคพีระยุทธ	-
dc.contributor.author	สิริ วงศ์วรชาติกาล	-
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	-
dc.date.accessioned	2021-02-12T07:27:38Z	-
dc.date.available	2021-02-12T07:27:38Z	-
dc.date.issued	2543 (Buddhist Era; 2000 CE)	-
dc.identifier.isbn	9741301111	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/72271	-
dc.description	Thesis (M.Eng.)--Chulalongkorn University, 2543	en_US
dc.description.abstract	ITU-T G.729 is a speech compression standard that is used in a wide range of applications. If the speech features needed for recognition can be extracted directly from the compressed speech code, a simple speech recognition system can therefore be built to work directly on G.729 codes. Speech energy, pitch period, and LSP (Line Spectral Pair) are parameters transmitted with the G.729 code, and they can be used for speech recognition. This thesis applies hidden Markov models and vector quantization to speaker-independent recognition of Thai speech. The vocabulary of 30 words is divided into two sets: the digits 0 to 9 and 20 single-syllable words. The reference (training) speech and the test speech come from both male and female speakers aged 18 to 25. In speaker-independent testing on the test set, the average recognition rate is 90.75%, with a recognition rate of 88.50% for the single-syllable word set and 93.00% for the digit set.	en_US
dc.description.abstractalternative	ITU-T Recommendation G.729 is a versatile and widely accepted speech compression standard. If the speech features needed for recognition can be extracted directly from the compressed code, a simple speech recognition system can operate directly on G.729 codes. Energy, pitch period, and LSP are parameters obtained from the G.729 code that can be used for speech recognition. This thesis uses hidden Markov models (HMMs) and vector quantization to recognize speaker-independent Thai speech. The 30-word vocabulary is divided into two sets: 20 single-syllable words and 10 Thai numeric words, zero to nine. The separate training and test sets both comprise male and female speakers aged 18 to 25. The average recognition rate of this speaker-independent system is 90.75%. The recognition rate is 88.50% for the single-syllable words and 93.00% for the numeric words.	en_US
dc.language.iso	th	en_US
dc.publisher	Chulalongkorn University	en_US
dc.rights	Chulalongkorn University	en_US
dc.subject	Automatic speech recognition	en_US
dc.title	Direct recognition of Thai speech from G.729 code	en_US
dc.title.alternative	Direct recognition of Thai speech from G.729 code	en_US
dc.type	Thesis	en_US
dc.degree.name	Master of Engineering	en_US
dc.degree.level	Master's degree	en_US
dc.degree.discipline	Electrical Engineering	en_US
dc.degree.grantor	Chulalongkorn University	en_US
dc.email.advisor	Suvit.N@Chula.ac.th	-
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
Siri_wo_front_p.pdf	Cover, table of contents, and abstract	779.89 kB	Adobe PDF
Siri_wo_ch1_p.pdf	Chapter 1	665.27 kB	Adobe PDF
Siri_wo_ch2_p.pdf	Chapter 2	1.3 MB	Adobe PDF
Siri_wo_ch3_p.pdf	Chapter 3	1.38 MB	Adobe PDF
Siri_wo_ch4_p.pdf	Chapter 4	760.17 kB	Adobe PDF
Siri_wo_ch5_p.pdf	Chapter 5	628.45 kB	Adobe PDF
Siri_wo_back_p.pdf	Bibliography and appendices	1.19 MB	Adobe PDF
