Extraction of tables and lists on the web to RDF

จุลเทพ นันทขว้าง

dc.contributor.advisor	ประภาส จงสถิตย์วัฒนา
dc.contributor.author	จุลเทพ นันทขว้าง
dc.contributor.other	Chulalongkorn University. Faculty of Engineering
dc.date.accessioned	2021-09-22T23:25:21Z
dc.date.available	2021-09-22T23:25:21Z
dc.date.issued	2563
dc.identifier.uri	http://cuir.car.chula.ac.th/handle/123456789/77057
dc.description	Thesis (D.Eng.)--Chulalongkorn University, 2563 B.E. (2020 C.E.)
dc.description.abstract	Today, Linked Data is growing rapidly along with the growth of the Web. Besides new data created specifically in semantic form, part of it comes from converting existing structured data into five-star open data. However, a large amount of data in structured and semi-structured form, such as tables and lists, which are the main formats that humans read, is still waiting to be converted. This research reviews prior work on converting tables and lists into various machine-readable formats. It also proposes a method for converting tables and lists into Resource Description Framework format while preserving the necessary original structure in detail, which makes it possible to reconstruct the original structured data. The TULIP system was built as a tool for Semantic Web development. The proposed method is more flexible than other work. The TULIP data model can store the source data completely and can present it again in views that differ from the original. This tool can be used to generate a very large amount of data for computers, enabling broader use than before.
dc.description.abstractalternative	Linked Data is currently growing at a rapid rate along with the growth of the Web. Aside from new information created exclusively as Semantic Web-ready data, part of it comes from transforming existing structured data into five-star open data. However, much legacy data in structured and semi-structured form, for example tables and lists, which are the principal human-readable formats, is still waiting for transformation. This work discusses prior research attempts to transform table and list data into various machine-readable formats. Furthermore, it proposes a method for transforming tables and lists into Resource Description Framework (RDF) format while thoroughly maintaining their essential structure, so that their original form can be recreated. A system named TULIP has been developed that embodies this conversion method as a tool for the future development of the Semantic Web. The proposed method is more flexible than other works. The TULIP data model contains complete information about the source; hence it can be projected into different views. This tool can be used to create a tremendous amount of data for machines to use at a broader scale.
dc.language.iso	th
dc.publisher	Chulalongkorn University
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2020.1131
dc.rights	Chulalongkorn University
dc.subject.classification	Computer Science
dc.title	Extraction of tables and lists on the web to RDF
dc.title.alternative	Extraction of tables and lists on the web to RDF
dc.type	Thesis
dc.degree.name	Doctor of Engineering
dc.degree.level	Doctoral Degree
dc.degree.discipline	Computer Engineering
dc.degree.grantor	Chulalongkorn University
dc.identifier.DOI	10.58837/CHULA.THE.2020.1131
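
The abstract describes converting tables into RDF while keeping enough of the original structure to rebuild the source. The record does not describe the TULIP data model itself, so the following is only a minimal sketch of that general idea under stated assumptions: the `http://example.org/table#` vocabulary, the predicate names (`hasCell`, `row`, `column`, `value`), and all function names are hypothetical and are not taken from the thesis. It assumes a rectangular table whose cells are plain strings.

```python
# Illustrative sketch only. The EX vocabulary and every name below are
# hypothetical; they are NOT the TULIP data model described in the thesis.
EX = "http://example.org/table#"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def table_to_triples(table_id, rows):
    """Turn a rectangular list of rows into (subject, predicate, object) triples.

    Each cell becomes its own resource that records its row index,
    column index, and text value, so the layout is not lost.
    """
    triples = []
    table = f"{EX}{table_id}"
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            cell = f"{table}/cell/{r}/{c}"
            triples.append((table, f"{EX}hasCell", cell))
            triples.append((cell, f"{EX}row", r))
            triples.append((cell, f"{EX}column", c))
            triples.append((cell, f"{EX}value", value))
    return triples


def triples_to_table(triples):
    """Rebuild the original rows from triples produced by table_to_triples."""
    cells = {}
    for subject, predicate, obj in triples:
        if predicate == f"{EX}hasCell":
            continue  # membership links carry no cell content
        cells.setdefault(subject, {})[predicate[len(EX):]] = obj
    if not cells:
        return []
    n_rows = max(cell["row"] for cell in cells.values()) + 1
    n_cols = max(cell["column"] for cell in cells.values()) + 1
    rows = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells.values():
        rows[cell["row"]][cell["column"]] = cell["value"]
    return rows


def to_ntriples(triples):
    """Serialize the triples as N-Triples text, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if predicate == f"{EX}hasCell":
            rendered = f"<{obj}>"  # object is a cell resource (IRI)
        elif isinstance(obj, int):
            rendered = f'"{obj}"^^<{XSD_INTEGER}>'
        else:
            escaped = (
                obj.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            rendered = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [["name", "qty"], ["apple", "3"]]
    triples = table_to_triples("t1", sample)
    print(to_ntriples(triples))
    print(triples_to_table(triples) == sample)
```

Because every cell carries its own row and column index, the table can be rebuilt without relying on the order of the triples. A fuller model would also need to cover headers, merged cells, and nested lists, which this sketch leaves out.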