Improvement of Thai/English transliterated word encoding for cross-language retrieval by syllable segmentation of phonetic codes

โอภาส วงษ์ทวีทรัพย์

dc.contributor.advisor	บุญเสริม กิจศิริกุล
dc.contributor.advisor	สมชาย ประสิทธิ์จูตระกูล
dc.contributor.author	โอภาส วงษ์ทวีทรัพย์
dc.contributor.other	Chulalongkorn University. Faculty of Engineering
dc.date.accessioned	2018-01-04T07:36:23Z
dc.date.available	2018-01-04T07:36:23Z
dc.date.issued	2549 BE (2006 CE)
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/56708
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2549 BE (2006 CE)	en_US
dc.description.abstract	Presents cross-language retrieval for Thai/English transliterated words, using neural networks to encode words together with a syllable segmentation step applied to the phonetic codes. The proposed method makes it possible to retrieve transliterated words across languages without a dictionary. Cross-language retrieval without a dictionary requires an encoding that represents the pronunciation of a word and is formed by concatenating the phonetic codes of the word's individual characters. Determining which phonetic code a given character produces requires considering its neighboring characters as well, so word encoding can be treated as a classification problem. For this reason, neural networks are used to encode words. However, because the codes of Thai and English words with the same pronunciation may differ somewhat, an approximate matching step is used to retrieve the words whose pronunciation is most similar. Experiments using K-fold cross validation show that, with the improved neural network, type-1 retrieval achieves an F1 of 83.28% for Thai transliterations of English words and an F1 of 90.54% for English transliterations of Thai words, at a phonetic code distance of 0.	en_US
dc.description.abstractalternative	Presents Thai/English cross-language transliterated word retrieval using neural networks and syllable segmentation of phonetic codes. The proposed method enables transliterated word retrieval without a dictionary; instead, phonetic codes are used for cross-language retrieval. The phonetic code of a word represents the sound of the word and consists of the sequence of phonetic codes of its characters. To determine the code of a particular character, it is necessary to consider its surrounding characters. Hence phonetic encoding can be treated as a classification problem, and neural networks are used to perform it. However, because the codes generated from a pair of corresponding Thai/English words are sometimes slightly different, approximate string matching based on character edits is applied to retrieve the most similar-sounding words. The experimental results, using K-fold cross validation, show F1-measure values of 83.28% for Thai/English transliterated word retrieval and 90.54% for English/Thai transliterated word retrieval, with zero distance between phonetic codes.	en_US
dc.language.iso	th	en_US
dc.publisher	Chulalongkorn University	en_US
dc.relation.uri	http://doi.org/10.14457/CU.the.2006.1073
dc.rights	Chulalongkorn University	en_US
dc.subject	Thai language -- Pronunciation	en_US
dc.subject	English language -- Pronunciation	en_US
dc.subject	Cross-language information retrieval	en_US
dc.subject	Neural networks (Computer sciences)	en_US
dc.subject	ภาษาไทย -- การออกเสียง	en_US
dc.subject	ภาษาอังกฤษ -- การออกเสียง	en_US
dc.subject	การค้นคืนสารสนเทศข้ามภาษา	en_US
dc.subject	นิวรัลเน็ตเวิร์ค (วิทยาการคอมพิวเตอร์)	en_US
dc.title	การปรับปรุงการเข้ารหัสคำทับศัพท์ภาษาไทย/อังกฤษ เพื่อการค้นคืนข้ามภาษาโดยการตัดพยางค์ของรหัสเสียง	en_US
dc.title.alternative	Improvement of Thai/English transliterated word encoding for cross-language retrieval by syllable segmentation of phonetic codes	en_US
dc.type	Thesis	en_US
dc.degree.name	Master of Science	en_US
dc.degree.level	Master's degree	en_US
dc.degree.discipline	Computer Science	en_US
dc.degree.grantor	Chulalongkorn University	en_US
dc.email.advisor	boonserm.k@chula.ac.th
dc.email.advisor	Somchai.P@Chula.ac.th
dc.identifier.DOI	10.14457/CU.the.2006.1073
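
The approximate matching step described in the abstract can be illustrated with a minimal sketch. It assumes the matching is a plain Levenshtein edit distance over phonetic code symbols; the record does not specify the exact distance, the coding scheme, or the neural encoder, so the function names, the `index` dictionary, and the code strings below are all hypothetical placeholders.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of phonetic code symbols."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def retrieve(query_code: str, index: Dict[str, str],
             max_distance: int = 0) -> List[str]:
    """Return indexed words whose code is within max_distance of the query,
    closest first (ties broken alphabetically)."""
    hits = []
    for word, code in index.items():
        d = edit_distance(query_code, code)
        if d <= max_distance:
            hits.append((d, word))
    return [word for _, word in sorted(hits)]


# Hypothetical phonetic codes; a real system would obtain them from the encoder.
index = {"computer": "KMPYTR", "commuter": "KMYTR", "printer": "PRNTR"}
exact = retrieve("KMPYTR", index, max_distance=0)
near = retrieve("KMPYTR", index, max_distance=1)
```

With `max_distance=0` only words whose code matches the query exactly are returned, which corresponds to the "zero distance between phonetic codes" setting reported in the abstract; raising the threshold trades precision for recall.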