Binary logistic regression model for predictive classification of ungrouped data

ศรีวิตรา ศิริวิสุทธิรัตน์

dc.contributor.advisor	สุพล ดุรงค์วัฒนา
dc.contributor.author	ศรีวิตรา ศิริวิสุทธิรัตน์
dc.contributor.other	Chulalongkorn University. Faculty of Commerce and Accountancy
dc.date.accessioned	2012-03-24T03:18:48Z
dc.date.available	2012-03-24T03:18:48Z
dc.date.issued	2553 B.E. (2010)
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/18478
dc.description	Thesis (Master of Statistics)--Chulalongkorn University, 2553 B.E. (2010)	en
dc.description.abstract	This study seeks the optimal cut-off point for predictive classification of ungrouped data in the binary logistic regression model. The factors studied are the proportion of failures of the characteristic of interest (a), set to 0.1, 0.5 and 0.9; the degree of correlation among the independent variables (M), set to 0, 0.33, 0.67 and 0.99; the sample size (n), at three levels: small (n = 20, 40), medium (n = 60, 80) and large (n = 100, 120); and the number of independent variables (p), at three levels: low (p = 1, 2), medium (p = 3, 4) and high (p = 5, 6). All data were simulated with the Monte Carlo technique in R. The cut-off point was found using the theory of Hadjicostas P. (2006). The results are summarized as follows. When the failure proportion changes and the other factors are held constant, the cut-off point converges to 0.5 at a failure proportion of 0.5 and is below 0.5 at the other values. When the degree of correlation among the independent variables increases and the other factors are held constant, the cut-off point tends to decrease from 0.5 at a failure proportion of 0.5; at the other values it decreases until the degree of correlation reaches 0.67 and then increases slightly. When the sample size increases and the other factors are held constant, the cut-off point converges to 0.5 when the number of independent variables is at the low level and is below 0.5 at the other levels. When the number of independent variables increases and the other factors are held constant, the cut-off points at failure proportions of 0.1 and 0.9 converge to the cut-off point at a failure proportion of 0.5, which is approximately 0.5. Estimating the cut-off point for all situations from the binary logistic regression model with interaction effects gives a high coefficient of determination (R2), indicating that the regression equation fits well and can be used to estimate the optimal cut-off point in other situations.	en
dc.description.abstractalternative	This study seeks the optimal cut-off point for predictive classification of ungrouped data using the binary logistic regression model. The factors of interest are the failure rate (a), with values 0.1, 0.5 and 0.9; the degree of multicollinearity among the independent variables (M), with values 0, 0.33, 0.67 and 0.99; the sample size (n), at three levels: low (n = 20, 40), medium (n = 60, 80) and high (n = 100, 120); and the number of independent variables (p), at three levels: low (p = 1, 2), medium (p = 3, 4) and high (p = 5, 6). The data are generated by the Monte Carlo technique in R. The cut-off point is obtained using the theory of Hadjicostas P. (2006). The results are summarized as follows. As the failure rate changes and the other factors are kept constant, the optimal cut-off point converges to 0.5 when the failure rate is 0.5; in the other situations, the optimal cut-off point is below 0.5. As the degree of multicollinearity increases and the other factors are kept constant, the optimal cut-off point tends to decrease from 0.5 when the failure rate equals 0.5; in the other situations, it decreases until the degree of multicollinearity reaches 0.67 and then increases slightly. As the sample size increases and the other factors are kept constant, the optimal cut-off point converges to 0.5 when the number of independent variables is at the low level; in the other situations, it is below 0.5. As the number of independent variables increases and the other factors are kept constant, the optimal cut-off points at failure rates of 0.1 and 0.9 converge to the cut-off point at a failure rate of 0.5, which is approximately 0.5. Finally, the cut-off point is estimated for all situations from the binary logistic regression model with all interaction terms; the R2 is high, indicating that the estimated regression equation fits well. From the estimated regression equation, the optimal cut-off point for other situations can be predicted.	en
dc.format.extent	4726822 bytes
dc.format.mimetype	application/pdf
dc.language.iso	th	es
dc.publisher	Chulalongkorn University	en
dc.relation.uri	http://doi.org/10.14457/CU.the.2010.579
dc.rights	Chulalongkorn University	en
dc.subject	Logistic regression analysis
dc.subject	Mathematical optimization
dc.title	Binary logistic regression model for predictive classification of ungrouped data	en
dc.title.alternative	Binary logistic regression model for ungrouped data predictive classification	en
dc.type	Thesis	es
dc.degree.name	Master of Statistics	es
dc.degree.level	Master's degree	es
dc.degree.discipline	Statistics	es
dc.degree.grantor	Chulalongkorn University	en
dc.email.advisor	Supol.D@Chula.ac.th
dc.identifier.DOI	10.14457/CU.the.2010.579
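
The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the thesis code (the thesis used R) and it does not implement the Hadjicostas (2006) result: here the cut-off is found by a plain grid search that maximizes the in-sample proportion of correct classifications. The function names, the unit slopes, the equicorrelated normal predictors, the small ridge penalty, and the use of `event_rate` as the baseline probability at the predictor mean are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    # Clip the linear predictor so exp() cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def simulate_dataset(n, p, rho, event_rate, rng):
    """Draw n rows of p equicorrelated normal predictors and a binary response.

    Assumption: all slopes are 1 and the intercept is logit(event_rate), so
    event_rate is the response probability at the predictor mean only.
    """
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    intercept = np.log(event_rate / (1.0 - event_rate))
    prob = sigmoid(intercept + X @ np.ones(p))
    y = (rng.random(n) < prob).astype(float)
    return X, y


def fit_logistic(X, y, ridge=1e-3, n_iter=50):
    """Newton-Raphson fit; the small ridge term (an assumption) keeps the
    Hessian invertible under strong collinearity or separation."""
    Z = np.column_stack([np.ones(len(y)), X])
    w = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = sigmoid(Z @ w)
        grad = Z.T @ (y - mu) - ridge * w
        hess = (Z * (mu * (1.0 - mu))[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, grad)
        w = w + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return w


def best_cutoff(prob, y, grid=None):
    """Return the grid cut-off with the highest proportion correctly classified,
    together with the accuracy at every grid point."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    acc = np.array([np.mean((prob >= c) == (y == 1)) for c in grid])
    return float(grid[int(np.argmax(acc))]), acc


def mean_cutoff(n, p, rho, event_rate, reps=200, seed=0):
    """Average the best cut-off over Monte Carlo replicates of one scenario."""
    rng = np.random.default_rng(seed)
    cutoffs = []
    for _ in range(reps):
        X, y = simulate_dataset(n, p, rho, event_rate, rng)
        if y.min() == y.max():
            continue  # skip samples with only one response class
        w = fit_logistic(X, y)
        prob = sigmoid(w[0] + X @ w[1:])
        cutoff, _ = best_cutoff(prob, y)
        cutoffs.append(cutoff)
    if not cutoffs:
        return float("nan"), 0
    return float(np.mean(cutoffs)), len(cutoffs)


if __name__ == "__main__":
    # One scenario from the design grid: n = 40, p = 2, M = 0.33, a = 0.5.
    avg, used = mean_cutoff(n=40, p=2, rho=0.33, event_rate=0.5, reps=100, seed=1)
    print(f"mean cut-off over {used} replicates: {avg:.3f}")
```

Looping `mean_cutoff` over the levels of a, M, n and p listed in the abstract would give one average cut-off per scenario, which is the kind of table a follow-up regression on the design factors could be fitted to.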