Generating transcriptions for romanized Thai person names

ชุลีกร กิตติกูล

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/30442

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	อติวงศ์ สุชาโต	-
dc.contributor.advisor	โปรดปราน บุณยพุกกณะ	-
dc.contributor.author	ชุลีกร กิตติกูล	-
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	-
dc.date.accessioned	2013-04-02T08:23:15Z	-
dc.date.available	2013-04-02T08:23:15Z	-
dc.date.issued	2554	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/30442	-
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2554 B.E. (2011)	en
dc.description.abstract	The transcription of a word can be generated by rules, produced by a statistical model, or looked up in a dictionary. However, the lack of a standard and the variety of ways in which Thai person names are converted to romanized names make this a challenging task. Although the dictionary-based approach appears to give the most accurate results, a letter-to-sound conversion component is still needed for words not found in the dictionary. This research proposes a method for transcribing romanized Thai person names into Thai sounds that takes popularity of usage into account. Romanized Thai person names are segmented into sequences of grams using a cumulative gram dictionary built from more than 130,000 names. The study found that the method achieves accuracies of 93% and 95%, measured by acceptability opinion scores, when transcriptions are generated from all possible sequences with unweighted and weighted Thai grams, respectively. With longest matching, the accuracies are 73% and 77% for unweighted and weighted Thai grams, respectively.	en
dc.description.abstractalternative	A transcription of a word can be produced by rules, generated by statistical models, or retrieved from a dictionary. However, the lack of standards and the variation in how Thai people romanize their names make transcription a challenging task. Although the dictionary-based approach seems to produce the most accurate results, a letter-to-sound conversion module is still necessary for unknown names. We propose an approach to transcribing romanized Thai person names into Thai sounds that considers popularity of usage. The romanized Thai names are parsed into sequences of grams using the Gram lexicon, built from a corpus of more than 130,000 names. The results show mean opinion scores of acceptability of 93% and 95% when the transcriptions are generated from all possible sequences with unweighted and weighted Thai grams, respectively. When the longest-match model is used, the acceptability levels are 73% and 77% for unweighted and weighted Thai grams, respectively.	en
dc.format.extent	1511441 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	th	es
dc.publisher	Chulalongkorn University	en
dc.relation.uri	http://doi.org/10.14457/CU.the.2011.1149	-
dc.rights	Chulalongkorn University	en
dc.subject	Thai language -- Transliteration	en
dc.title	Generating transcriptions for romanized Thai person names	en
dc.title.alternative	Generating transcriptions for romanized Thai person names	en
dc.type	Thesis	es
dc.degree.name	Master of Science	es
dc.degree.level	Master's Degree	es
dc.degree.discipline	Computer Science	es
dc.degree.grantor	Chulalongkorn University	en
dc.email.advisor	Atiwong.S@Chula.ac.th	-
dc.email.advisor	Proadpran.Pu@Chula.ac.th	-
dc.identifier.DOI	10.14457/CU.the.2011.1149	-
Appears in Collections:	Eng - Theses
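
The abstract contrasts two ways of parsing a romanized name into grams: enumerating all possible gram sequences and greedy longest matching. The following is a minimal sketch of that contrast, not the thesis implementation. The lexicon entries, the counts, and the function names (`all_segmentations`, `longest_match`, `rank_transcriptions`) are invented for illustration, and the scoring rule (product of relative counts) is an assumption, since the abstract does not specify how Thai grams are weighted.

```python
from functools import lru_cache
from math import prod

# Hypothetical toy lexicon: romanized gram -> {Thai gram: usage count}.
# Entries and counts are invented; the real Gram lexicon is not shown here.
GRAM_LEXICON = {
    "som": {"สม": 9},
    "so": {"โส": 2},
    "m": {"ม": 1},
    "chai": {"ชาย": 8, "ชัย": 5},
    "cha": {"ชา": 3},
    "i": {"อิ": 1},
}


def all_segmentations(name):
    """Return every split of `name` into grams that exist in the lexicon."""
    @lru_cache(maxsize=None)
    def split(start):
        if start == len(name):
            return [()]
        results = []
        for end in range(start + 1, len(name) + 1):
            gram = name[start:end]
            if gram in GRAM_LEXICON:
                results.extend((gram,) + rest for rest in split(end))
        return results

    return [list(seg) for seg in split(0)]


def longest_match(name):
    """Greedy left-to-right segmentation; None if no gram fits at some point."""
    grams, start = [], 0
    while start < len(name):
        for end in range(len(name), start, -1):
            if name[start:end] in GRAM_LEXICON:
                grams.append(name[start:end])
                start = end
                break
        else:
            return None
    return grams


def rank_transcriptions(name):
    """Rank segmentations by an assumed score: product of relative counts."""
    total = sum(sum(c.values()) for c in GRAM_LEXICON.values())
    ranked = []
    for seg in all_segmentations(name):
        # For each gram, take its most frequent Thai rendering in the toy data.
        best = [max(GRAM_LEXICON[g].items(), key=lambda kv: kv[1]) for g in seg]
        thai = "".join(t for t, _ in best)
        score = prod(n / total for _, n in best)
        ranked.append((score, thai, seg))
    return sorted(ranked, key=lambda r: r[0], reverse=True)
```

With this toy data, `longest_match("somchai")` commits to a single segmentation, while `rank_transcriptions("somchai")` scores every segmentation the lexicon allows and returns them best first, so a less greedy split can still win if its grams are more popular.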

Files in This Item:

File	Description	Size	Format
chuleekorn_ki.pdf		1.48 MB	Adobe PDF