Machine Reading Comprehension for Multiclass Questions on Thai Corpus

Theerit Lapchaicharoenkit

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/70356

Title:	Machine Reading Comprehension for Multiclass Questions on Thai Corpus
Other Titles:	การอ่านทำความเข้าใจด้วยเครื่องสำหรับคำถามหลายประเภทบนคลังข้อความภาษาไทย
Authors:	Theerit Lapchaicharoenkit
Advisors:	Peerapon Vateekul
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Peerapon.V@Chula.ac.th
Subjects:	Machine learning; Neural networks (Computer science); Questions and answers
Issue Date:	2019
Publisher:	Chulalongkorn University
Abstract:	Previous research on Thai question answering and machine reading comprehension has focused on small-scale datasets and has not used deep learning to build models. In this research, we develop a Thai machine reading comprehension (MRC) model on the Thai MRC dataset provided by NECTEC. This dataset consists of 17,000 question-answer pairs and has two classes of questions: factoid and yes-no. We use BiDAF as the base MRC architecture. We perform experiments with three multiclass model designs: a special-token model, a joint model, and a cascade model. We also use contextual embeddings for the Thai language to enhance the model's performance. The results suggest that the cascade architecture achieves the best F1 score. We then incorporate transfer learning and modify the attention mechanisms to increase the model's accuracy on yes-no questions.
Other Abstract:	Previous research on question answering and reading comprehension was carried out on relatively small datasets and did not use deep learning to build models. In this research, the researcher builds a reading comprehension model on the Thai reading comprehension dataset from the National Electronics and Computer Technology Center (NECTEC). The dataset contains 17,000 question-answer pairs in total, which fall into two types: factoid questions and yes-no questions. The researcher uses BiDAF as the main model. The researcher experiments with three model structures for answering multiple types of questions: the special token model, the joint model, and the cascade model. The researcher uses contextual embeddings to improve the model's performance. After finding that the cascade model performs best, the researcher applies transfer learning and modifies the attention mechanism to improve the model's ability on yes-no questions.
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2019
Degree Name:	Master of Science
Degree Level:	Master's Degree
Degree Discipline:	Computer Science
URI:	http://cuir.car.chula.ac.th/handle/123456789/70356
URI:	http://doi.org/10.58837/CHULA.THE.2019.168
DOI:	10.58837/CHULA.THE.2019.168
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
6170932121.pdf		2.52 MB	Adobe PDF
