Comparison of the efficiency of imputation methods in multilevel data: applications to educational inequality analysis

นวลรัตน์ ฉิมสุด

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/80961

Title:	การเปรียบเทียบประสิทธิภาพของวิธีทดแทนค่าสูญหายในข้อมูลพหุระดับ: การประยุกต์ใช้กับการวิเคราะห์ความเหลื่อมล้ำทางการศึกษา
Other Titles:	Comparison of the efficiency of imputation methods in multilevel data: applications to educational inequality analysis
Authors:	นวลรัตน์ ฉิมสุด
Advisors:	ประภาศิริ รัชชประภาพรกุล; สิวะโชติ ศรีสุทธิยากร
Other author:	Chulalongkorn University. Faculty of Education
Issue Date:	2021 (B.E. 2564)
Publisher:	Chulalongkorn University
Abstract:	This research aimed (1) to compare the efficiency of three methods for imputing missing data, namely MI-FCS, RF, and Opt.impute (which comprises Opt.knn, Opt.tree, Opt.svm, and Opt.cv), using simulated data and then applying the results to real data; and (2) to analyze educational inequality with a multilevel model using data in which missing values had been imputed, and to compare the results with an analysis of educational inequality without imputation. The findings were as follows. (1) Looking at the overall comparison of imputation efficiency in the simulation, Opt.impute tended to be the most efficient method in most cases, followed by RF and MI-FCS, respectively. (2) The researcher collected secondary data on Grade 9 (Matthayom 3) students from the National Institute of Educational Testing Service (NIETS) for the 2020 academic year (B.E. 2563), covering 2,109 schools under the Secondary Educational Service Area Offices, and applied the imputation method selected in the simulation to these data. The results showed that the school-level proportion of students whose families lack financial means and who do not live with their parents had a statistically significant effect on school-level student achievement, and this effect reflects educational inequality. Comparison with the analysis without imputation showed that analyzing the data while ignoring missing values, or deleting them, may have a statistically significant effect on the estimation of the true parameters, or may prevent accurate and precise inference to the population.
Other Abstract:	The purposes of this research were to (1) compare the efficiency of three imputation methods for multilevel missing data: Multiple Imputation with Fully Conditional Specification (MI-FCS), Random Forest (RF), and four types of Optimal Impute (Opt.impute), namely Opt.knn, Opt.tree, Opt.svm, and Opt.cv. A simulation study based on real-population educational data with a random coefficients model was conducted, and the results were then applied to real data. (2) Analyze educational inequality using multilevel analysis of data in which the missing values had been imputed, and compare the results with an analysis of educational inequality that does not compensate for the missing values. The findings were as follows. (1) The Opt.impute method had the highest efficiency, followed by the RF method and the MI-FCS method, respectively. Opt.impute can reduce constraints, is flexible, and increases the efficiency of imputation in multilevel data even when the types of missing data are complex and severe. (2) The researcher used school-level secondary data on junior high school students, covering 2,109 schools within 42 educational clusters in the year 2020, from the National Institute of Educational Testing Service (Public Organization), or NIETS. The results show that the influence of the school-level proportion of students whose families are underfunded and who do not live with their parents on achievement was low at the educational service area level, while at the school level it influenced achievement significantly; this impact reflects educational inequality. The results were compared with the analysis of educational inequality that does not compensate for missing values. The findings show that if the data are analyzed without regard to missing values, or if the missing values are omitted, the estimation of the actual parameters may be affected to a statistically significant degree, or the results cannot be inferred to the population accurately and precisely.
Description:	Thesis (M.Ed.)--Chulalongkorn University, 2021
Degree Name:	Master of Education
Degree Level:	Master's Degree
Degree Discipline:	Educational Statistics and Information
URI:	http://cuir.car.chula.ac.th/handle/123456789/80961
URI:	http://doi.org/10.58837/CHULA.THE.2021.1066
metadata.dc.identifier.DOI:	10.58837/CHULA.THE.2021.1066
Type:	Thesis
Appears in Collections:	Edu - Theses

Files in This Item:

File	Description	Size	Format
6282024627.pdf		3.17 MB	Adobe PDF