Applying latent semantic analysis to classification of emotions in Thai text

ปิยธิดา อินทร์รักษ์

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/18347

Title:	การประยุกต์ใช้การวิเคราะห์ความหมายแฝงกับการจำแนกประเภทอารมณ์ในข้อความภาษาไทย
Other Titles:	Applying latent semantic analysis to classification of emotions in Thai text
Authors:	ปิยธิดา อินทร์รักษ์
Advisors:	สุกรี สินธุภิญโญ
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	sukree@cp.eng.chula.ac.th
Subjects:	Thai language -- Semantics; Thai language -- Adverbs; Latent class analysis
Issue Date:	2009 (B.E. 2552)
Publisher:	Chulalongkorn University
Abstract:	As data communication over the Internet continues to grow, large volumes of text data are being produced as well. This text can be extracted and classified into categories. Emotion classification is another topic of current interest, but emotion classification of Thai text is not yet sufficiently effective. This research classifies short Thai texts into six basic universal emotions, namely anger, disgust, fear, happiness, sadness, and surprise, following prior research. The study compares the results of two models built from sentences of various forms, each applied with three methods: Naive Bayes, Support Vector Machine (SVM), and Decision Tree. The first model classifies using latent semantic analysis of single words. The second model applies latent semantic analysis of word pairs that frequently occur together, combined with the semantic space of single words. The comparison shows that the second model achieves higher accuracy than the first, based on the Naive Bayes classification method, which gave better results than the other methods.
Other Abstract:	With the rapid growth of Internet communication, many types of text are being produced. These texts convey meanings that can contribute to text categorization. Emotion classification is attracting growing interest, but emotions in Thai text still cannot be classified accurately. This work therefore proposes a novel approach that takes advantage of bi-word occurrence to classify the emotion hidden in a short sentence. We classify Thai text into six basic universal emotions (anger, disgust, fear, happiness, sadness, and surprise) using a Latent Semantic Analysis (LSA) approach. We compare two models that construct features from the sentences and apply both to three classification methods: Naive Bayes, SVM, and Decision Tree. The first feature model uses only single-word occurrence. The second uses single-word occurrence combined with bi-word occurrence. The results show that the second model yields higher accuracy than the first under the Naive Bayes classification method.
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2009 (B.E. 2552)
Degree Name:	Master of Science
Degree Level:	Master's Degree
Degree Discipline:	Computer Science
URI:	http://cuir.car.chula.ac.th/handle/123456789/18347
URI:	http://doi.org/10.14457/CU.the.2009.307
metadata.dc.identifier.DOI:	10.14457/CU.the.2009.307
Type:	Thesis
Appears in Collections:	Eng - Theses
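
The two feature models described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes sentences are already segmented into words, it treats a "bi-word" as a pair of adjacent words (the thesis may define co-occurring pairs differently), and all function names and sample tokens are hypothetical. It only builds the feature-by-sentence count matrices that an LSA step would then decompose; the decomposition and the classifiers are left out.

```python
from collections import Counter


def single_word_features(tokens):
    """Model 1 (sketch): counts of individual words in one sentence."""
    return Counter(tokens)


def combined_features(tokens):
    """Model 2 (sketch): single words plus adjacent word pairs.

    Assumption: a bi-word is a pair of neighbouring tokens.
    """
    feats = Counter(tokens)
    feats.update(zip(tokens, tokens[1:]))
    return feats


def feature_sentence_matrix(sentences, featurize):
    """Return (vocab, matrix); rows are features, columns are sentences."""
    counts = [featurize(s) for s in sentences]
    # key=str lets word strings and word-pair tuples sort together.
    vocab = sorted({f for c in counts for f in c}, key=str)
    matrix = [[c[f] for c in counts] for f in vocab]
    return vocab, matrix


# Hypothetical, pre-segmented placeholder sentences (English stand-ins).
sentences = [
    ["i", "am", "very", "happy"],
    ["i", "am", "very", "afraid"],
]

vocab1, m1 = feature_sentence_matrix(sentences, single_word_features)
vocab2, m2 = feature_sentence_matrix(sentences, combined_features)
```

In this toy input the two sentences share every single word except the last, so the bi-word rows add features such as ("very", "happy") that tie the emotion word to its context. Real Thai input would first need a word segmenter, which is outside this sketch.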

Files in This Item:

File	Description	Size	Format
Piyatida_In.pdf		1.9 MB	Adobe PDF
