Collaborative Information Framework for Structured Document Retrieval in Organization

นัทธี ศรีหาจักษ์

dc.contributor.advisor	ญาใจ ลิ่มปิยะกรณ์	en_US
dc.contributor.author	นัทธี ศรีหาจักษ์	en_US
dc.contributor.other	Chulalongkorn University. Faculty of Engineering	en_US
dc.date.accessioned	2015-06-24T06:45:16Z
dc.date.available	2015-06-24T06:45:16Z
dc.date.issued	2556	en_US
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/43828
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2013 (B.E. 2556)	en_US
dc.description.abstract	Searching for a group of documents whose contexts are related is challenging, because it is difficult to judge whether the retrieved documents contain content that is correct, appropriate, and matched to the user's needs. This research therefore proposes a Collaborative Information Framework that gathers the interesting and appropriate essential content from retrieved documents, which are structured documents in XML format. The proposed approach consists of two main parts: a document information retrieval part and an information presentation part. The structured-document information retrieval part decomposes, collects, and examines the contexts in the documents in order to extract the essential content that is appropriate and matches the user's needs. It does so with an XML search technique that uses the XQuery language together with a method of tagging data with controlled vocabularies, made up of keywords and near-synonyms, to build an index with the XPath language. The result set from the search is then examined for context relationships by clustering with the k-means algorithm and the TF-IDF measure, to indicate the relevance of the retrieved documents. Next, the information presentation part orders and formats the information according to predefined templates, using the XSLT language to transform XML data into HTML. The retrieval results from the experiments in this research were evaluated with precision, recall, and F-measure, giving averages of 83%, 84%, and 83%, respectively, which is a moderately good level.	en_US
dc.description.abstractalternative	Searching for a cluster of documents with related contexts is challenging, as it is difficult to assess whether those documents contain relevant content and satisfy the user's needs. This research therefore presents a Collaborative Information Framework for retrieving the proper and interesting content from structured documents in XML format. The proposed approach consists of two main components: document information retrieval and information presentation. The document information retrieval component is in charge of decomposing documents and collecting the proper contexts that satisfy user needs with an XML searching technique, which uses the XQuery language and a method of index tagging in the XPath language based on controlled vocabularies composed of keywords and synonyms. The set of documents resulting from the search is then clustered by the k-means algorithm with the TF-IDF measure to examine context relevance. Next, the information presentation component re-orders and re-formats the obtained information based on predefined templates, using the XSLT language to transform XML data to HTML. The information retrieval results from the experiment in this study, evaluated with Precision, Recall, and F-measure, yield averages of 83%, 84%, and 83%, respectively, which can be rated as moderately good.	en_US
dc.language.iso	th	en_US
dc.publisher	Chulalongkorn University	en_US
dc.relation.uri	http://doi.org/10.14457/CU.the.2013.1285
dc.rights	Chulalongkorn University	en_US
dc.subject	Computer programs
dc.subject	Information storage and retrieval systems
dc.title	กรอบงานสารสนเทศควบรวมสำหรับการค้นคืนเอกสารมีโครงสร้างในองค์กร	en_US
dc.title.alternative	COLLABORATIVE INFORMATION FRAMEWORK FOR STRUCTURED DOCUMENT RETRIEVAL IN ORGANIZATION	en_US
dc.type	Thesis	en_US
dc.degree.name	Master of Science	en_US
dc.degree.level	Master's Degree	en_US
dc.degree.discipline	Computer Science	en_US
dc.degree.grantor	Chulalongkorn University	en_US
dc.email.advisor	yachai.l@chula.ac.th	en_US
dc.identifier.DOI	10.14457/CU.the.2013.1285
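
The clustering and evaluation steps named in the abstract (TF-IDF weighting, k-means, and the F-measure) can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which the record does not show. The TF-IDF variant (relative term frequency times log(N/df)), the farthest-first centroid initialization, the helper names, and the toy documents are all assumptions made for illustration. Documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def k_means(vectors, k, iterations=10):
    """Plain k-means on sparse vectors; returns one cluster label per vector.

    Initialization is farthest-first (an assumption, chosen so the sketch is
    deterministic): start from the first vector, then repeatedly add the
    vector farthest from the centroids chosen so far.
    """
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                terms = set().union(*members)
                centroids[j] = {
                    t: sum(m.get(t, 0.0) for m in members) / len(members)
                    for t in terms
                }
    return labels


def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy documents, already tokenized.
docs = [
    ["invoice", "payment", "invoice", "tax"],
    ["payment", "invoice", "tax", "receipt"],
    ["server", "network", "router", "server"],
    ["network", "router", "firewall", "server"],
]
labels = k_means(tf_idf(docs), k=2)
print(labels)
print(round(f_measure(0.83, 0.84), 2))
```

On this toy input the two finance-like documents and the two network-like documents fall into separate clusters. The last line only checks that a precision of 83% and a recall of 84% are consistent with an F-measure of about 83%. The record does not say whether the reported F value was computed from the averaged precision and recall or averaged over individual queries.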