Iterative error re-establishment to improve neural network cost function for imbalanced data learning

Perasut Rungcharassang

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/35925

Title:	Iterative error re-establishment to improve neural network cost function for imbalanced data learning
Other Titles:	การตั้งค่าความผิดพลาดใหม่แบบวนซ้ำเพื่อปรับปรุงฟังก์ชันค่าใช้จ่ายของโครงข่ายประสาทสำหรับการเรียนรู้ข้อมูลแบบไม่ดุล
Authors:	Perasut Rungcharassang
Advisors:	Chidchanok Lursinsap
Other author:	Chulalongkorn University. Faculty of Science
Advisor's Email:	Chidchanok.L@Chula.ac.th
Subjects:	Neural networks (Computer science); Algorithms; Imbalanced Problem; นิวรัลเน็ตเวิร์ค (คอมพิวเตอร์); อัลกอริทึม
Issue Date:	2011
Publisher:	Chulalongkorn University
Abstract:	This thesis proposes a new training algorithm that improves minority-class accuracy in imbalanced data learning. The algorithm rests on the observation that minority-class accuracy is low because the cost function is dominated by the error terms (the squared differences between the target and the actual output) contributed by the majority-class data. To counter this domination, the cost function is re-established at each epoch from the current errors of the data in both the minority and majority classes: any datum whose term in the cost function yields an error below 0.05 is removed from the cost function, and any other datum is put back. The algorithm, which uses a multilayer perceptron trained with Levenberg-Marquardt (LM), was compared with classical LM and the more recent RAMOBoost algorithm on 15 well-known benchmarks. It achieved higher accuracy than the other approaches in 13 of the 15 cases and trained faster.
Other Abstract:	This research presents a new training algorithm to increase minority-class accuracy in the imbalanced data learning problem. The algorithm stems from the observation that low accuracy is caused by the influence of the error terms, computed as the square of the difference between the target and the actual output, of the majority-class data in the cost function. To solve this problem, the cost function is re-established at each epoch according to the errors of the data in the minority and majority classes. Any datum whose computed error is less than 0.05 is removed from the cost function; the remaining data are put back into the cost function. The new training algorithm was compared with the Levenberg-Marquardt method and the RAMOBoost method on 15 standard data sets. The experimental results show that this method achieves higher accuracy than the other methods in 13 cases, together with faster training.
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2011
Degree Name:	Master of Science
Degree Level:	Master's Degree
Degree Discipline:	Applied Mathematics and Computational Science
URI:	http://cuir.car.chula.ac.th/handle/123456789/35925
URI:	http://doi.org/10.14457/CU.the.2011.74
metadata.dc.identifier.DOI:	10.14457/CU.the.2011.74
Type:	Thesis
Appears in Collections:	Sci - Theses
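
The per-epoch rule described in the abstract (drop a datum from the cost function when its error is below 0.05, put it back otherwise) can be sketched as follows. This is only an illustrative sketch, not the thesis implementation: it assumes the "error" is the squared difference between target and output, and it swaps in a single sigmoid unit trained by plain gradient descent in place of the multilayer perceptron trained with Levenberg-Marquardt. The names `active_mask` and `train`, the learning rate, and the epoch count are hypothetical.

```python
import numpy as np


def active_mask(targets, outputs, threshold=0.05):
    """Mark the data that stay in the cost function for this epoch.

    Assumption: the error of a datum is its squared-error term.
    Data with error below `threshold` are left out; all others are kept.
    """
    errors = (targets - outputs) ** 2
    return errors >= threshold


def train(X, t, epochs=200, lr=0.5, threshold=0.05, seed=0):
    """Toy trainer: one sigmoid unit, plain gradient descent (not LM)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    b = 0.0
    for _ in range(epochs):
        y = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        # Re-establish the cost function: the mask is recomputed every
        # epoch, so a datum dropped earlier returns if its error grows.
        mask = active_mask(t, y, threshold)
        if not mask.any():
            break  # every datum is already below the threshold
        # Gradient of 0.5 * mean squared error over the active data only.
        delta = (y[mask] - t[mask]) * y[mask] * (1.0 - y[mask])
        w -= lr * (X[mask].T @ delta) / mask.sum()
        b -= lr * delta.sum() / mask.sum()
    return w, b
```

Because well-fitted data drop out of the sum, the many majority-class terms stop outweighing the few minority-class terms once the majority class is learned, which is the effect the abstract attributes to the method.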

Files in This Item:

File	Description	Size	Format
perasut_ru.pdf		1.39 MB	Adobe PDF
