Virtual adversarial training with weighted token perturbation in text classification

ธีรพงศ์ แซ่ลิ้ม

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/80380

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	สุรณพีร์ ภูมิวุฒิสาร	-
dc.contributor.author	ธีรพงศ์ แซ่ลิ้ม	-
dc.contributor.other	Chulalongkorn University. Faculty of Commerce and Accountancy	-
dc.date.accessioned	2022-07-25T02:24:44Z	-
dc.date.available	2022-07-25T02:24:44Z	-
dc.date.issued	2564	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/80380	-
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2564 B.E.	-
dc.description.abstract	Text classification is the process of sorting text into the correct categories. Bidirectional Encoder Representations from Transformers (BERT), a pretrained model, helps a model learn the context of words in both directions, so that it can classify text efficiently and accurately. Although BERT and the models derived from its architecture handle natural language processing tasks very well, they still suffer from overfitting: when the training set contains few examples, BERT attends too heavily to certain words and ignores the context of the sentence, so it cannot predict the test set correctly, which lowers the model's performance. This thesis therefore proposes virtual adversarial training with weighted token perturbation, which combines two levels of perturbation, sentence-level perturbation and weighted token perturbation, to create finer-grained perturbations than traditional virtual adversarial training, which relies on sentence-level perturbation alone. This method helps the model learn the important tokens in a sentence. Experiments on the General Language Understanding Evaluation (GLUE) benchmark show that the proposed method improves model performance, achieving an average score of 79.5%, which outperforms BERT and can resolve the overfitting problem on small datasets.	-
dc.description.abstractalternative	Text classification is the process of assigning text to categories. Among the contextualized architectures that have been proposed, pretrained Bidirectional Encoder Representations from Transformers (BERT) helps models learn the bidirectional context of words, making it possible to classify text more efficiently and accurately. Although BERT and its variants have led to impressive gains on many natural language processing (NLP) tasks, BERT is prone to overfitting. When training data is limited, a BERT model overemphasizes certain words and ignores the context of the sentence, which makes it difficult for the model to make accurate predictions on the test data. We propose virtual adversarial training with weighted token perturbation, which combines two levels of perturbation, (1) sentence-level perturbation and (2) weighted token perturbation, to create a more granular perturbation than traditional virtual adversarial training, which uses only sentence-level perturbation. Our approach helps models learn more about the key tokens in sentences when trained with virtual adversarial examples. Experiments on the General Language Understanding Evaluation (GLUE) benchmark show that our approach achieves an average score of 79.5%, which outperforms the BERT-base model and reduces overfitting on small datasets.	-
dc.language.iso	th	-
dc.publisher	Chulalongkorn University	-
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2021.1057	-
dc.rights	Chulalongkorn University	-
dc.subject.classification	Computer Science	-
dc.subject.classification	Mathematics	-
dc.title	การฝึกปรปักษ์เสมือนด้วยการรบกวนแบบถ่วงน้ำหนักโทเค็นในการจัดประเภทข้อความ	-
dc.title.alternative	Virtual adversarial training with weighted token perturbation in text classification	-
dc.type	Thesis	-
dc.degree.name	Master of Science	-
dc.degree.level	Master's Degree	-
dc.degree.discipline	Statistics	-
dc.degree.grantor	Chulalongkorn University	-
dc.identifier.DOI	10.58837/CHULA.THE.2021.1057	-
Appears in Collections:	Acctn - Theses

Files in This Item:

File	Description	Size	Format
6380157926.pdf		2.2 MB	Adobe PDF
