Feature Selection and Extraction for Thai Emotional Speech Classification

เมธี เจริญดี

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/54912

Title:	การคัดเลือกและสกัดคุณลักษณะเพื่อการจำแนกอารมณ์จากเสียงพูดภาษาไทย
Other Titles:	Feature Selection and Extraction for Thai Emotional Speech Classification
Authors:	เมธี เจริญดี
Advisors:	อติวงศ์ สุชาโต; โปรดปราน บุณยพุกกณะ
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Atiwong.S@Chula.ac.th,atiwong@gmail.com,atiwong.s@chula.ac.th citation.car.chula@gmail.com
Issue Date:	2016 (B.E. 2559)
Publisher:	Chulalongkorn University
Abstract:	Humans use speech to convey stories as well as emotions and feelings. Speech serves not only for communication among humans but also for communication between humans and computers. If a computer understands the emotion a person intends to convey, it can respond to that person more effectively, so enabling computers to recognize emotion from speech effectively is important. This thesis therefore proposes an algorithm for recognizing emotion from speech on an emotion corpus drawn from Thai dramas. Features extracted at the segment level are summarized with statistical values, the number of features is reduced with kernel principal component analysis, and classification is performed with a support vector machine. Experiments show that the proposed algorithm gives an F-measure higher than reference baseline data sets 1 and 2, which use features extracted at the utterance level, by 16.12% and 13.16%, respectively. It also gives a class-equal-weighted average recall (unweighted accuracy) higher than reference baseline data sets 1 and 2 by 21.65% and 18.82%, respectively.
Other Abstract:	Humans convey stories and emotions through speech. Speech is used not only in human-to-human communication but also in communication between humans and computers. If computers understand the emotions that humans intend to convey, they can respond to humans more effectively, so it is important for computers to recognize emotions from speech effectively. This thesis proposes an algorithm for speech emotion recognition on the EMOLA corpus. Features are extracted at the segment level, statistics are computed over them, and the number of features is reduced using Kernel Principal Component Analysis; classification is then performed with a support vector machine. Experiments show that the proposed algorithm yields an F-measure higher than reference baseline data sets 1 and 2, which use utterance-level features, by 16.12% and 13.16%, respectively. It also gives a macro-averaged recall (unweighted accuracy) higher than baseline data sets 1 and 2 by 21.65% and 18.82%, respectively.
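Note on the metric: the abstract equates macro-averaged recall with unweighted accuracy. The sketch below is a minimal illustration of that metric, not code from the thesis. The function names and the toy emotion labels are hypothetical; the only assumption is the standard definition, in which recall is computed per class and then averaged with every class weighted equally.

```python
from collections import defaultdict


def macro_averaged_recall(y_true, y_pred):
    """Average of per-class recall, with every class weighted equally."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = defaultdict(int)  # correctly predicted items per true class
    total = defaultdict(int)    # number of items per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def plain_accuracy(y_true, y_pred):
    """Fraction of items predicted correctly (dominated by large classes)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, imbalanced toy labels: a classifier that always answers "neutral".
y_true = ["neutral"] * 8 + ["angry"] * 2
y_pred = ["neutral"] * 10

print(plain_accuracy(y_true, y_pred))         # dominated by the majority class
print(macro_averaged_recall(y_true, y_pred))  # penalizes the missed minority class
```

On imbalanced emotion data the two numbers can diverge sharply, which is one common reason to report the macro-averaged figure alongside F-measure.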
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2016 (B.E. 2559)
Degree Name:	Master of Science
Degree Level:	Master's Degree
Degree Discipline:	Computer Science
URI:	http://cuir.car.chula.ac.th/handle/123456789/54912
URI:	http://doi.org/10.58837/CHULA.THE.2016.816
metadata.dc.identifier.DOI:	10.58837/CHULA.THE.2016.816
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
5670553121.pdf		3.57 MB	Adobe PDF