A Study of Word Sense Discrimination in Thai Using Latent Semantic Analysis

นัชชา ถิระสาโรช

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/58076

Title:	การศึกษาการแยกความหมายของคำหลายความหมายในภาษาไทยโดยใช้วิธีการวิเคราะห์ความหมายแอบแฝง
Other Titles:	A STUDY OF WORD SENSE DISCRIMINATION IN THAI USING LATENT SEMANTIC ANALYSIS
Authors:	นัชชา ถิระสาโรช
Advisors:	วิโรจน์ อรุณมานะกุล
Other author:	Chulalongkorn University. Faculty of Arts
Advisor's Email:	Wirote.A@Chula.ac.th,awirote@gmail.com
Issue Date:	2559 BE (2016 CE)
Publisher:	Chulalongkorn University
Abstract:	This thesis aims to develop a system for discriminating the senses of polysemous words using latent semantic analysis, to compare the system's performance in discriminating the senses of polysemous words within the same word class and across different word classes, and to identify the contexts best suited to discriminating the senses of the words studied. The words studied are the nouns เสียง 'sound' and หัว 'head' and the verbs บอก 'tell' and ติด 'stick'. เสียง and บอก represent words with few senses, and หัว and ติด represent words with many senses. The study uses context words at various positions to help discriminate the senses. The results show that systems using context words adjacent to the target word with a small window discriminate senses better than systems using a large window, and that the system discriminates the senses of words with few senses better than those of words with many senses, because the data are less dispersed. Differences in word class do not affect system performance, and context position does not affect sense discrimination for nouns or verbs. Because this research uses only word forms, what mainly helps the system discriminate senses is the context words that co-occur with the target word regularly and with high frequency. The accuracy of the system in this research is still not very good, falling between x and y. The likely causes are the use of word forms only and the small amount of data used.
Other Abstract:	The main purpose of this study is to develop a Thai word sense discrimination system using Latent Semantic Analysis, to compare the performance of the systems when discriminating the senses of polysemous words in the same and in different word classes, and to identify the contexts that help distinguish the senses. The words used in this study are the nouns /sieng4/ ‘sound’ and /hua4/ ‘head’ and the verbs /bɔk1/ ‘tell’ and /tɪt1/ ‘stick’. /sieng4/ and /bɔk1/ represent words with few senses, whereas /hua4/ and /tɪt1/ represent words with many senses. Context words serve as the clues for discriminating the senses in this study. The results show that systems using a small window size tend to distinguish word senses better than those using a large window size. Moreover, the systems discriminate the senses of words with few senses better than those of words with many senses, because the data are less dispersed. Part of speech does not affect the performance of the systems, and context position has no effect on discriminating the senses of nouns and verbs. Since this study uses only word forms, the key to discriminating the senses is the words that co-occur with the target words with high frequency. The word sense discrimination results in this study are not good: accuracy ranges from x to y. This may be due to the use of word forms only and the small amount of training data.
Description:	Thesis (Ph.D.)--Chulalongkorn University, 2559 BE (2016 CE)
Degree Name:	Doctor of Philosophy
Degree Level:	Doctoral Degree
Degree Discipline:	Linguistics
URI:	http://cuir.car.chula.ac.th/handle/123456789/58076
URI:	http://doi.org/10.58837/CHULA.THE.2016.721
metadata.dc.identifier.DOI:	10.58837/CHULA.THE.2016.721
Type:	Thesis
Appears in Collections:	Arts - Theses
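
The abstract compares context windows of different sizes and positions around a target word. As a rough illustration only, the sketch below shows one way such context features could be collected and compared. It is not the thesis's implementation: the function names (context_window, context_vectors, cosine), the "left"/"right"/"both" position labels, and the toy English sentences standing in for a word-segmented Thai corpus are all assumptions made for this example. It also stops at raw co-occurrence counts; the dimensionality reduction that Latent Semantic Analysis applies to such counts, and any grouping of occurrences into senses, are not shown.

```python
from collections import Counter
from math import sqrt


def context_window(tokens, target_index, size, side="both"):
    """Return the context words within `size` tokens of the target.

    `side` selects the context position: "left", "right", or "both".
    (These labels are hypothetical, chosen for this sketch.)
    """
    if side not in ("left", "right", "both"):
        raise ValueError("side must be 'left', 'right', or 'both'")
    left = tokens[max(0, target_index - size):target_index]
    right = tokens[target_index + 1:target_index + 1 + size]
    if side == "left":
        return left
    if side == "right":
        return right
    return left + right


def context_vectors(sentences, target, size, side="both"):
    """Build one bag-of-words count vector (a Counter) per target occurrence."""
    vectors = []
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token == target:
                vectors.append(Counter(context_window(tokens, i, size, side)))
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy, pre-tokenized English sentences (invented for illustration).
sentences = [
    "he nodded his head slowly".split(),
    "she shook her head slowly".split(),
    "the head of the department resigned".split(),
]

# One context vector per occurrence of "head", using one word on each side.
vectors = context_vectors(sentences, "head", size=1)

# Occurrences sharing context words score higher than those sharing none.
similar = cosine(vectors[0], vectors[1])
different = cosine(vectors[0], vectors[2])
```

With a window of one word on each side, the first two occurrences share the context word "slowly" and so have a non-zero similarity, while the third shares no context words with the first. Changing size or side alters which words enter each vector, which is the kind of variation the abstract reports comparing.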

Files in This Item:

File	Description	Size	Format
5480510622.pdf		4.16 MB	Adobe PDF