Comparison of imputation methods for nonignorable missing data in the analysis of stationary time series

ธีรเดช สิงห์อินทร์

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/50394

Title:	Comparison of imputation methods for nonignorable missing data in the analysis of stationary time series
Other Titles:	Comparison of the imputation methods for nonignorable missing data in time series analysis with stationary
Authors:	ธีรเดช สิงห์อินทร์
Advisors:	อนุภาพ สมบูรณ์สวัสดี
Other author:	Chulalongkorn University. Faculty of Commerce and Accountancy
Advisor's Email:	Anupap.S@Chula.ac.th, mr.anupap@gmail.com, anupap@cbs.chula.ac.th
Subjects:	Missing observations (Statistics); Time-series analysis
Issue Date:	2015 (B.E. 2558)
Publisher:	Chulalongkorn University
Abstract:	Missing values in time series data are a common problem in statistical analysis and can arise for many reasons. To estimate the missing values accurately, an appropriate method must be chosen according to the type of missing data and the mechanism that produces it, so that the resulting estimates are as good as possible. This study aims to compare imputation methods for the analysis of time series with missing data. The methods studied are Mean Imputation, LOCF, and the EM Algorithm. The data were obtained by simulation, with three missing proportions (10%, 20%, and 30%) and three levels of nonignorable missingness (none, medium, and high). The methods were compared using the average of the mean absolute percentage error (Average Mean Absolute Percentage Error: AMAPE), with the following results: i) for the AR(1) model, Mean Imputation is the most efficient when the sample size is small (n=50,100) and the first-order autoregressive parameter is 0.2; ii) the EM Algorithm is the most efficient when the first-order autoregressive parameter is 0.5; iii) LOCF is the most efficient when the sample size is small (n=50,100) and the first-order autoregressive parameter is 0.8; iv) for the AR(2) model, Mean Imputation is the most efficient when the first- and second-order autoregressive parameters are 0.1; v) Mean Imputation is the most efficient when the sample size is small (n=50) and the first- and second-order autoregressive parameters are 0.25; vi) the EM Algorithm is the most efficient when the first- and second-order autoregressive parameters are 0.4.
Other Abstract:	Missing data in time series is a common problem in statistical analysis and can occur for many reasons. To estimate missing values accurately, it is necessary to select an appropriate method depending on the type of missing data and the mechanism generating it, so as to obtain the best possible estimates. The purpose of this study is to compare imputation methods for time series analysis with missing data. The imputation methods were Mean Imputation, LOCF (last observation carried forward), and the EM Algorithm. The data were simulated under three missing-data percentages (10%, 20%, and 30%) and three levels of nonignorable missingness (none, medium, and high). The methods were compared using the average mean absolute percentage error (AMAPE). The findings are as follows: i) for the first-order autoregressive model, Mean Imputation performs best when the sample size is small (n=50,100) and the first-order autoregressive parameter equals 0.2; ii) the EM Algorithm performs best when the first-order autoregressive parameter equals 0.5; iii) LOCF performs best when the sample size is small (n=50,100) and the first-order autoregressive parameter equals 0.8; iv) for the second-order autoregressive model, Mean Imputation performs best when the first- and second-order autoregressive parameters equal 0.1; v) Mean Imputation performs best when the sample size is small (n=50) and the first- and second-order autoregressive parameters equal 0.25; vi) the EM Algorithm performs best when the first- and second-order autoregressive parameters equal 0.4.
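The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code. The abstract does not state the innovation distribution, the series level, the number of replications, or the exact nonignorable mechanism, so everything below is an assumption made for illustration: standard normal innovations, a hypothetical series level of 10 (so percentage errors are well defined), and a missingness rule in which larger values are more likely to be dropped, controlled by a hypothetical `strength` parameter (0 gives missingness that ignores the values). Only Mean Imputation and LOCF are sketched; the EM Algorithm is left out because it needs a model-specific likelihood fit.

```python
import random
from statistics import mean


def simulate_ar1(n, phi, seed=0, burn_in=200, mu=10.0):
    """Simulate x_t = mu + phi*(x_{t-1} - mu) + e_t with e_t ~ N(0, 1).

    mu=10 is an assumed level, chosen only to keep values away from zero.
    """
    rng = random.Random(seed)
    x, series = mu, []
    for t in range(n + burn_in):
        x = mu + phi * (x - mu) + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard the burn-in so the start value has no effect
            series.append(x)
    return series


def make_missing(series, proportion, strength, seed=1):
    """Drop a fixed proportion of points; larger values are favoured when strength > 0.

    This mechanism is an assumption, not the one used in the thesis.
    Index 0 is always kept so LOCF has a starting value.
    """
    rng = random.Random(seed)
    k = round(proportion * len(series))
    scores = {t: strength * series[t] + rng.gauss(0.0, 1.0)
              for t in range(1, len(series))}
    missing = set(sorted(scores, key=scores.get, reverse=True)[:k])
    observed = [None if t in missing else v for t, v in enumerate(series)]
    return observed, missing


def mean_impute(observed):
    """Replace every missing value with the mean of the observed values."""
    m = mean([v for v in observed if v is not None])
    return [m if v is None else v for v in observed]


def locf_impute(observed):
    """Replace every missing value with the last observed value before it."""
    filled, last = [], None
    for v in observed:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def mape(true, filled, missing):
    """Mean absolute percentage error over the imputed positions only."""
    return 100.0 * mean([abs((true[t] - filled[t]) / true[t]) for t in missing])


def amape(impute, n, phi, proportion, strength, reps=200):
    """Average the MAPE over independent simulated series."""
    results = []
    for r in range(reps):
        true = simulate_ar1(n, phi, seed=r)
        observed, missing = make_missing(true, proportion, strength, seed=10_000 + r)
        results.append(mape(true, impute(observed), missing))
    return mean(results)


if __name__ == "__main__":
    for phi in (0.2, 0.5, 0.8):
        for name, method in (("mean", mean_impute), ("LOCF", locf_impute)):
            score = amape(method, n=50, phi=phi, proportion=0.2, strength=1.0)
            print(f"phi={phi} {name}: AMAPE={score:.2f}%")
```

Running the script prints one AMAPE value per method and autoregressive parameter. Because the design choices above are assumed, the numbers are not expected to reproduce the thesis's results.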
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2015
Degree Name:	Master of Science
Degree Level:	Master's degree
Degree Discipline:	Statistics
URI:	http://cuir.car.chula.ac.th/handle/123456789/50394
URI:	http://doi.org/10.14457/CU.the.2015.972
metadata.dc.identifier.DOI:	10.14457/CU.the.2015.972
Type:	Thesis
Appears in Collections:	Acctn - Theses

Files in This Item:

File	Description	Size	Format
5681541426.pdf		13.9 MB	Adobe PDF