Data Classification by the ANOVAID Algorithm

นวทิพย์ ไมตรี

dc.contributor.advisor	สุพล ดุรงค์วัฒนา	en_US
dc.contributor.author	นวทิพย์ ไมตรี	en_US
dc.contributor.other	Chulalongkorn University. Faculty of Commerce and Accountancy	en_US
dc.date.accessioned	2016-12-02T02:06:22Z
dc.date.available	2016-12-02T02:06:22Z
dc.date.issued	2558	en_US
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/50911
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2558 B.E. (2015 C.E.)	en_US
dc.description.abstract	This research studies the data classification process of the ANOVAID algorithm, which combines one-way analysis of variance with the t-test for two independent samples, where the dependent variable is quantitative and the independent variables are qualitative. The algorithm works in two steps: selecting an independent variable, and merging the groups of that independent variable. In the selection step, the algorithm considers the independent variable with the smallest one-way ANOVA p-value among all independent variables, and that p-value must also be significant for the variable to enter the process. The t-test for two independent samples is then used to merge groups of the selected independent variable, based on p-values that are not significant. If these conditions are not met, the algorithm stops. For each resulting group, the remaining independent variables are classified separately and independently, until no independent variable remains or the algorithm stops. The data used in the study are simulated with the number of factor levels equal to 2, 3 and 4; data sizes of 6,000, 12,000 and 24,000; variances of 10,000 and 40,000; and ratios of means of 0.5, 1 and 2. Tests are conducted at the 0.05 significance level, and the percentage of misclassification is the criterion for judging how well the algorithm classifies. The results show that as the variance increases, the percentage of misclassification tends to increase; as the data size increases, it tends to decrease; as the ratio of means increases, it tends to decrease; and as the number of factor levels increases, the percentage of misclassification does not differ.	en_US
dc.description.abstractalternative	This study examines the classification process of the ANOVAID algorithm, which combines one-way ANOVA with the independent-samples t-test. The dependent variable is quantitative and the independent variables are fixed qualitative factors. The algorithm has two steps: independent-variable selection and merging. In the selection step, the independent variable with the smallest one-way ANOVA p-value is selected, provided that this p-value is statistically significant. The independent-samples t-test is then used to merge levels of the selected variable whose p-values are not significant. If these conditions are not met, the algorithm stops. Within each merged group, the remaining independent variables are classified separately and independently at the next level of the hierarchy, and so on until no independent variable remains to classify or the algorithm stops. The data are simulated under several scenarios defined by the number of factor levels (2, 3 and 4), the sample size of each data set (6,000, 12,000 and 24,000), the variance of the random error in the one-way ANOVA model (10,000 and 40,000) and the ratio of means (0.5, 1 and 2), with hypothesis tests conducted at the 0.05 significance level. The percentage of misclassification is used to measure how well the algorithm performs. The results show that the percentage of misclassification increases as the random-error variance increases, decreases as the sample size increases, decreases as the ratio of means increases, and does not differ as the number of factor levels increases.	en_US
dc.language.iso	th	en_US
dc.publisher	Chulalongkorn University	en_US
dc.relation.uri	http://doi.org/10.14457/CU.the.2015.965
dc.rights	Chulalongkorn University	en_US
dc.subject	Statistics -- Data processing
dc.subject	Data -- Classification
dc.subject	Algorithms
dc.title	Data classification by the ANOVAID algorithm	en_US
dc.title.alternative	Data classification by ANOVAID algorithm	en_US
dc.type	Thesis	en_US
dc.degree.name	Master of Science	en_US
dc.degree.level	Master's Degree	en_US
dc.degree.discipline	Statistics	en_US
dc.degree.grantor	Chulalongkorn University	en_US
dc.email.advisor	Supol.D@Chula.ac.th,supol@cbs.chula.ac.th	en_US
dc.identifier.DOI	10.14457/CU.the.2015.965
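
The two-step procedure described in the abstract (select the predictor with the smallest significant one-way ANOVA p-value, merge its levels with two-sample t-tests, then recurse within each merged group) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation. The function names (betainc, anova_p, merge_levels, anovaid), the row format (a list of dicts with a response key "y") and the toy data are all hypothetical. The abstract does not say in which order levels are merged, so the sketch assumes a CHAID-style rule: repeatedly merge the pair of levels with the largest non-significant p-value. The t-test is taken to be the pooled-variance version, whose two-sided p-value equals the one-way ANOVA p-value for two groups, so a single helper serves both steps.

```python
import math
from itertools import combinations

ALPHA = 0.05  # significance level reported in the abstract

def betainc(a, b, x):
    """Regularized incomplete beta I_x(a, b) via a Lentz continued fraction."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x > (a + 1.0) / (a + b + 2.0):  # use symmetry so the fraction converges fast
        return 1.0 - betainc(b, a, 1.0 - x)
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x)) / a
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 1000):
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        for coef in (even, odd):
            d = 1.0 + coef * d
            c = 1.0 + coef / c
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return front * h

def anova_p(groups):
    """One-way ANOVA F-test p-value. With two groups it equals the two-sided
    pooled-variance independent-samples t-test p-value (t squared is F)."""
    k, n = len(groups), sum(len(g) for g in groups)
    d1, d2 = k - 1, n - k
    if d1 < 1 or d2 < 1:
        return 1.0
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ssw == 0.0:
        return 0.0 if ssb > 0.0 else 1.0
    f = (ssb / d1) / (ssw / d2)
    return betainc(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))

def merge_levels(by_level):
    """ASSUMED merge rule: join the pair of level groups with the largest
    p-value for as long as that p-value is not significant."""
    groups = {(level,): ys for level, ys in by_level.items()}
    while len(groups) > 1:
        p, a, b = max(((anova_p([groups[a], groups[b]]), a, b)
                       for a, b in combinations(groups, 2)), key=lambda t: t[0])
        if p < ALPHA:
            break  # every remaining pair differs significantly
        groups[a + b] = groups.pop(a) + groups.pop(b)
    return groups

def anovaid(rows, predictors, y="y", path=()):
    """Return the leaf segments as a list of (path, rows) pairs."""
    best = None
    for var in predictors:
        by_level = {}
        for r in rows:
            by_level.setdefault(r[var], []).append(r[y])
        if len(by_level) < 2:
            continue
        p = anova_p(list(by_level.values()))
        if best is None or p < best[0]:
            best = (p, var, by_level)
    if best is None or best[0] >= ALPHA:
        return [(path, rows)]  # stop: no predictor is significant
    _, var, by_level = best
    rest = [v for v in predictors if v != var]
    leaves = []
    for levels in merge_levels(by_level):
        subset = [r for r in rows if r[var] in levels]
        leaves += anovaid(subset, rest, y, path + ((var, levels),))
    return leaves

# Toy data (hypothetical): regions A and B share a mean, C differs, "noise" is unrelated.
errors = [-10, -5, 0, 5, 10]
rows = [{"region": reg, "noise": "x" if (i // 5) % 2 == 0 else "y",
         "y": base + errors[i % 5]}
        for reg, base in (("A", 100), ("B", 100), ("C", 200))
        for i in range(20)]
leaves = anovaid(rows, ["region", "noise"])
for path, members in leaves:
    print(path, len(members))
```

On this toy data the sketch selects region, merges levels A and B, keeps C separate, and stops because noise is not significant inside either segment. With SciPy available, scipy.stats.f_oneway and scipy.stats.ttest_ind could replace the hand-written p-value helpers.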