Natural language processing for digital advertising

Yiping Jin

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/79815

Title:	Natural language processing for digital advertising
Other Titles:	การประมวลผลภาษาธรรมชาติสำหรับการโฆษณาดิจิทัล
Authors:	Yiping Jin
Advisors:	Dittaya Wanvarie
Other author:	Chulalongkorn University. Faculty of Science
Issue Date:	2021
Publisher:	Chulalongkorn University
Abstract:	Advertising is not only a marketing or sales activity but also a particular form of two-way communication. In this thesis, we propose to apply the two main subtasks of natural language processing (NLP), namely natural language understanding (NLU) and natural language generation (NLG), to digital advertising to enhance its effectiveness. We apply weakly-supervised text classification to rapidly build text classifiers for contextual advertising (Jin et al., 2022). The method requires a handful of labeled keywords instead of a large corpus of labeled documents and can be easily transferred to new domains. We further evaluate the weakly-supervised models using unsupervised error estimation and perform automatic keyword selection (Jin et al., 2021a). Unsupervised error estimation is essential because no labeled development dataset is available in the real-world problems where weakly-supervised text classification methods are applied. Finally, we build on a state-of-the-art sequence-to-sequence Transformer model to generate cohesive and diverse advertising slogans from a short company description (Jin et al., In press). We prevent the model from hallucinating unsupported information using entity masking and generate diverse and catchy slogans using conditional training.
Other Abstract:	(Translated from Thai) Advertising is not merely a marketing or sales activity but a form of two-way communication. In this thesis, the researcher proposes applying two natural language processing tasks, namely natural language understanding and natural language generation, to digital advertising in order to increase advertising effectiveness. The researcher applies weakly-supervised text classification to rapidly build text classifiers for contextual advertising (Jin et al., 2022). This method requires only a small number of labeled keywords instead of a large corpus of documents labeled by category, and it can also be adapted easily to new domains. The researcher further evaluates the weakly-supervised models using unsupervised error estimation and selects keywords automatically (Jin et al., 2021a). Unsupervised error estimation is necessary because no labeled dataset is available when weakly-supervised text classification is used in real-world settings. The Transformer is the best-performing model for text-to-text conversion, and the researcher uses it to generate relevant and diverse advertising slogans from a short company description (Jin et al., In press). The researcher prevents the use of unsupported information by masking organization names during training, and generates diverse, attractive slogans using conditional training.
Description:	Thesis (Ph.D.)--Chulalongkorn University, 2021
Degree Name:	Doctor of Philosophy
Degree Level:	Doctoral Degree
Degree Discipline:	Computer Science and Information Technology
URI:	http://cuir.car.chula.ac.th/handle/123456789/79815
URI:	http://doi.org/10.58837/CHULA.THE.2021.121
metadata.dc.identifier.DOI:	10.58837/CHULA.THE.2021.121
Type:	Thesis
Appears in Collections:	Sci - Theses

Files in This Item:

File	Description	Size	Format
6173105023.pdf		3.71 MB	Adobe PDF
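
The abstract mentions entity masking as a way to keep the slogan generator from hallucinating unsupported information. The minimal Python sketch below illustrates the general idea only and is not the thesis's implementation: the placeholder format `[ENT0]`, the function names, and the assumption that the list of entities is already known are all hypothetical. Entities in the company description are swapped for placeholders before the text reaches a sequence-to-sequence model (not shown here), and the placeholders in the model output are swapped back afterwards.

```python
import re


def mask_entities(text, entities):
    """Replace each known entity string with a placeholder such as [ENT0].

    Returns the masked text and a placeholder-to-entity mapping.
    The placeholder format is an assumption made for this illustration.
    """
    mapping = {}
    masked = text
    # Mask longer names first so "Acme Coffee" is handled before "Acme".
    for i, name in enumerate(sorted(entities, key=len, reverse=True)):
        token = f"[ENT{i}]"
        pattern = re.compile(re.escape(name), flags=re.IGNORECASE)
        if pattern.search(masked):
            masked = pattern.sub(token, masked)
            mapping[token] = name
    return masked, mapping


def unmask_entities(text, mapping):
    """Put the original entity strings back into generated text."""
    for token, name in mapping.items():
        text = text.replace(token, name)
    return text


def unsupported_placeholders(generated, mapping):
    """List placeholders in the output that never appeared in the input."""
    return [t for t in re.findall(r"\[ENT\d+\]", generated) if t not in mapping]


description = "Acme Coffee roasts beans in Bangkok."
masked, mapping = mask_entities(description, ["Acme Coffee", "Bangkok"])
print(masked)

# A hand-written stand-in for model output; no model is called here.
generated = "Wake up with [ENT0]."
print(unmask_entities(generated, mapping))
```

Because the model only ever sees and emits placeholders, any placeholder in its output that is missing from the mapping can be flagged as unsupported by the input, which is what `unsupported_placeholders` checks in this sketch.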