A Comparison of Outlier Detection Methods in Linear Regression Analysis

วศิรินทร์ วารีเศวตสุวรรณ

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/9946

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	มานพ วราภักดิ์	-
dc.contributor.author	วศิรินทร์ วารีเศวตสุวรรณ	-
dc.contributor.other	Chulalongkorn University. Faculty of Commerce and Accountancy	-
dc.date.accessioned	2009-08-11T08:28:12Z	-
dc.date.available	2009-08-11T08:28:12Z	-
dc.date.issued	2545 B.E. (2002 C.E.)	-
dc.identifier.isbn	9741726341	-
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/9946	-
dc.description	Thesis (Master of Statistics)--Chulalongkorn University, 2545 B.E. (2002 C.E.)	en
dc.description.abstract	This study compares the ability of outlier detection methods in linear regression analysis when the outliers occur in the dependent variable. Four detection methods are studied: the Kianifard and Swallow tests, namely the Sequential Recursive Method (SRM) and the Modified Recursive Method (MRM); the S.R. Paul & Karen Y. Fung test (PK); and the Daniel Pena & Victor Yohai test (PY). The comparison is carried out under two conditions for the distribution of the random errors: the case without outliers, in which the errors are normally distributed, and the case with outliers, in which the errors follow a contaminated normal distribution (both location-contaminated and scale-contaminated normal distributions are studied). The design uses three contamination proportions (0.05, 0.10 and 0.15), three outlier levels (slight, moderate and severe), 1 and 3 independent variables, seven sample sizes (20, 30, 40, 50, 60, 80 and 100) and three significance levels (0.01, 0.05 and 0.10). The data were generated by Monte Carlo simulation, with 500 replications for each specified situation. The methods are compared on the following measures of detection correctness: the probability of correct detection when the data contain no outliers (P1), the probability of incorrect detection when the data contain no outliers (P2), the probability of correct detection when the data contain outliers (P3), the probability of incorrect detection when the data contain outliers (P4), and the total percentage of correct detection (TP%). The results, based on TP% as computed from P1, P2, P3 and P4 across the simulated situations, fall into two cases. 1) Errors with a location-contaminated normal distribution: at the low contamination proportion (0.05), MRM has the highest TP% at sample size 20 for every number of independent variables and every significance level, followed by SRM, PK and PY, in that order. As the sample size increases, SRM has the highest TP%, followed by PK, PY and MRM, in that order. At medium to high contamination proportions (0.10-0.15), PY has the highest TP% at every sample size, every number of independent variables and every significance level, followed by SRM, PK and MRM, in that order. 2) Errors with a scale-contaminated normal distribution: at the low contamination proportion (0.05), the results are the same as in the location-contaminated case at a contamination proportion of 0.05. At medium to high contamination proportions (0.10-0.15), SRM has the highest TP% at every sample size, every number of independent variables and every significance level, followed by PK, PY and MRM, in that order.	en
dc.description.abstractalternative	To compare the capability of outlier detection methods in linear regression analysis when outliers occur in the dependent variable. The detection methods are the Kianifard and Swallow methods (Sequential Recursive Method: SRM, and Modified Recursive Method: MRM), the S.R. Paul & Karen Y. Fung method (PK) and the Daniel Pena & Victor Yohai method (PY). The comparison was done under the following conditions. The distributions of the random errors are the normal distribution (when no outlier is present) and the contaminated normal distribution (when outliers are present), with both location-contaminated and scale-contaminated normal distributions considered. The proportions of contamination are 0.05, 0.10 and 0.15. The sizes of the outliers in the dependent variable are at small, medium and large levels. The numbers of independent variables are 1 and 3. The sample sizes are 20, 30, 40, 50, 60, 80 and 100. The significance levels are 0.01, 0.05 and 0.10. The data for this experiment were generated with the Monte Carlo simulation technique. The experiment was repeated 500 times under each condition to compare measures of detection correctness: the probability of correct detection when the data contain no outlier (P1), the probability of incorrect detection when the data contain no outlier (P2), the probability of correct detection when the data contain outliers (P3), the probability of incorrect detection when the data contain outliers (P4), and the percent of total correct detection (TP%). The results, based on TP% as calculated from P1, P2, P3 and P4, fall into 2 cases. 1) The random errors follow a location-contaminated normal distribution. When the proportion of contamination is small (0.05), the TP% of the MRM method is the highest at sample size 20, for every number of independent variables and every significance level, followed by SRM, PK and PY, respectively. At larger sample sizes the TP% of the SRM method is the highest, followed by PK, PY and MRM, respectively. When the proportion of contamination is medium to large (0.10-0.15), the TP% of the PY method is the highest at all sample sizes, all numbers of independent variables and all significance levels, followed by SRM, PK and MRM, respectively. 2) The random errors follow a scale-contaminated normal distribution. When the proportion of contamination is small (0.05), the result is the same as in the location-contaminated case. When the proportion of contamination is medium to large (0.10-0.15), the TP% of the SRM method is the highest at all sample sizes, all numbers of independent variables and all significance levels, followed by PK, PY and MRM, respectively.	en
dc.format.extent	1228663 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	th	es
dc.publisher	Chulalongkorn University	en
dc.relation.uri	http://doi.org/10.14457/CU.the.2002.435	-
dc.rights	Chulalongkorn University	en
dc.subject	Outliers (Statistics)	en
dc.subject	Regression analysis	en
dc.title	A Comparison of Outlier Detection Methods in Linear Regression Analysis	en
dc.title.alternative	A comparison on detecting outlier methods in linear regression analysis	en
dc.type	Thesis	es
dc.degree.name	Master of Statistics	es
dc.degree.level	Master's degree	es
dc.degree.discipline	Statistics	es
dc.degree.grantor	Chulalongkorn University	en
dc.email.advisor	fcommva@acc.chula.ac.th, Manop.V@Chula.ac.th	-
dc.identifier.DOI	10.14457/CU.the.2002.435	-
Appears in Collections:	Acctn - Theses
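
The abstract describes errors drawn from location-contaminated and scale-contaminated normal distributions in a Monte Carlo study. The following is a minimal Python sketch of how such errors might be generated, not the thesis's actual procedure. The function names, the standard normal base distribution, the shift and scale values, the regression coefficients and the uniform design for x are all assumptions made for illustration; the record does not state them.

```python
import random


def contaminated_normal_errors(n, proportion, kind, shift=3.0, scale=3.0, rng=None):
    """Draw n errors from a contaminated normal distribution.

    Each error comes from N(0, 1) with probability 1 - proportion and from a
    contaminating component otherwise:
      kind="location": N(shift, 1)     (shifted mean; `shift` is an assumed value)
      kind="scale":    N(0, scale**2)  (inflated spread; `scale` is an assumed value)
    Returns the errors and a parallel list of flags marking contaminated draws.
    """
    if kind not in ("location", "scale"):
        raise ValueError("kind must be 'location' or 'scale'")
    if rng is None:
        rng = random.Random()
    errors, flags = [], []
    for _ in range(n):
        contaminated = rng.random() < proportion
        if not contaminated:
            e = rng.gauss(0.0, 1.0)
        elif kind == "location":
            e = rng.gauss(shift, 1.0)
        else:
            e = rng.gauss(0.0, scale)
        errors.append(e)
        flags.append(contaminated)
    return errors, flags


def simulate_simple_regression(n, proportion, kind, beta0=1.0, beta1=2.0, rng=None):
    """Simulate y = beta0 + beta1 * x + e with one independent variable.

    The coefficients and the uniform(0, 10) design for x are hypothetical.
    """
    if rng is None:
        rng = random.Random()
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    errors, flags = contaminated_normal_errors(n, proportion, kind, rng=rng)
    y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, errors)]
    return x, y, flags


# One replication at sample size 20 with a contamination proportion of 0.10.
x, y, flags = simulate_simple_regression(20, 0.10, "location", rng=random.Random(2002))
print(sum(flags), "of", len(flags), "observations came from the contaminating component")
```

Returning the contamination flags alongside the data lets a later step count how many flagged observations a detection method recovers. The abstract does not give the formula for TP%, so the sketch does not compute it.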

Files in This Item:

File	Description	Size	Format
Wasirin.pdf		1.2 MB	Adobe PDF
