Data structure for Thai electronic dictionary

สมปรารถนา รัทยานนท์

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/48365

Title:	โครงสร้างข้อมูลสำหรับพจนานุกรมอิเล็กทรอนิกส์ภาษาไทย
Other Titles:	Data structure for Thai electronic dictionary
Authors:	สมปรารถนา รัทยานนท์
Advisors:	วิลาศ วูวงศ์ สุยุชน์ สัตยประกอบ
Other author:	Chulalongkorn University. Graduate School
Advisor's Email:	No information, No information
Issue Date:	2535 BE (1992 CE)
Publisher:	Chulalongkorn University
Abstract:	An electronic dictionary is a data repository for linguistic processing tasks, such as word segmentation and spell checking in word processors and syntactic analysis in natural language processing. As computer processing of the Thai language has advanced, interest in developing Thai electronic dictionaries has grown with it. This research aims to develop Thai electronic dictionaries using the double-array data structure with digital search, in two forms. The first stores vocabulary words directly. The second stores the isolated words obtained by decomposing vocabulary words. The results show that the second form requires less storage space than the first. In addition, a dictionary-based word segmentation algorithm was developed that returns every word appearing in the Thai electronic dictionary. Performance tests found that the word retrieval algorithm of the developed dictionaries takes more time than the Kasetsart University dictionary, but it can readily serve as a basis for developing dictionary-based word segmentation algorithms.
Other Abstract:	Electronic dictionaries are stores of machine-readable lexical items. They are used in advanced word processors for word segmentation and spell checking, and in natural language processing for syntactic, semantic, and discourse analysis. Owing to advances in computer processing of the Thai language, more attention has recently been paid to the development of Thai electronic dictionaries. This study proposes and develops two frameworks for Thai electronic dictionaries, both employing a double-array digital search tree. The first framework treats a Thai word as a lexical entity and directly applies the double-array digital search tree to store lexical entities. The second recognizes that many Thai words are composed of a few isolated words, and therefore stores a word in terms of those isolated words, resulting in less storage space. Based on the dictionaries, a Thai word segmentation algorithm has been developed that produces all possible segmentations. Experiments have been conducted to evaluate the performance of the proposed frameworks. The frameworks were found to take more time to retrieve words than the Kasetsart University approach, but they allow a simple word segmentation algorithm to be developed.
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2535 BE (1992 CE)
Degree Name:	Master of Science
Degree Level:	Master's degree
Degree Discipline:	Computer Science
URI:	http://cuir.car.chula.ac.th/handle/123456789/48365
ISBN:	9745811769
Type:	Thesis
Appears in Collections:	Grad - Theses
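
Illustrative sketch (not taken from the thesis): the abstract describes storing words in a double-array digital search tree and segmenting text into every sequence of dictionary words. The Python below is a minimal sketch of that idea under stated assumptions. The function names, the `END` marker, the character coding, and the first-fit placement of `BASE` values are all hypothetical choices made for illustration; the thesis's actual construction and segmentation algorithms may differ.

```python
from collections import deque

END = "\0"   # hypothetical end-of-word marker; assumes no word contains it
FREE = -1    # CHECK value for an unused slot


def build_double_array(words):
    """Pack a trie over `words` into BASE/CHECK arrays (state 0 is the root)."""
    # Build a plain nested-dict trie first; it is only a construction aid.
    trie = {}
    for word in words:
        node = trie
        for ch in word + END:
            node = node.setdefault(ch, {})
    # Give every character a positive integer code.
    alphabet = sorted({ch for word in words for ch in word} | {END})
    code = {ch: i + 1 for i, ch in enumerate(alphabet)}

    base, check = [0], [-2]          # -2 marks the root slot as occupied
    queue = deque([(0, trie)])
    while queue:
        state, node = queue.popleft()
        if not node:
            continue                 # leaf: no outgoing transitions
        codes = [code[ch] for ch in node]
        b = 0
        while True:                  # first-fit search for a usable BASE value
            grow = b + max(codes) + 1 - len(check)
            if grow > 0:
                base.extend([0] * grow)
                check.extend([FREE] * grow)
            if all(check[b + c] == FREE for c in codes):
                break
            b += 1
        base[state] = b
        for ch, child in node.items():
            target = b + code[ch]
            check[target] = state    # CHECK records the parent state
            queue.append((target, child))
    return base, check, code


def step(da, state, ch):
    """Follow one transition; return the next state or None if it is absent."""
    base, check, code = da
    c = code.get(ch)
    if c is None:
        return None
    target = base[state] + c
    if target < len(check) and check[target] == state:
        return target
    return None


def word_ends(da, text, start):
    """End offsets of every dictionary word that begins at text[start]."""
    ends, state = [], 0
    for i in range(start, len(text)):
        state = step(da, state, text[i])
        if state is None:
            break
        if step(da, state, END) is not None:
            ends.append(i + 1)
    return ends


def segment_all(da, text, start=0):
    """Every way to split text[start:] into dictionary words."""
    if start == len(text):
        return [[]]
    results = []
    for end in word_ends(da, text, start):
        for rest in segment_all(da, text, end):
            results.append([text[start:end]] + rest)
    return results


if __name__ == "__main__":
    da = build_double_array(["ตา", "กลม", "ตาก", "ลม"])
    print(segment_all(da, "ตากลม"))  # [['ตา', 'กลม'], ['ตาก', 'ลม']]
```

A transition from state `s` on a character with code `c` is valid only when `CHECK[BASE[s] + c] == s`, which is what lets many trie nodes share two flat arrays. The recursive enumeration can grow exponentially on long ambiguous inputs, so a practical segmenter would likely memoize results by start offset.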

Files in This Item:

File	Size	Format
Somprathana_ra_front.pdf	1.37 MB	Adobe PDF
Somprathana_ra_ch1.pdf	751.93 kB	Adobe PDF
Somprathana_ra_ch2.pdf	2 MB	Adobe PDF
Somprathana_ra_ch3.pdf	2.52 MB	Adobe PDF
Somprathana_ra_ch4.pdf	2.52 MB	Adobe PDF
Somprathana_ra_ch5.pdf	977.87 kB	Adobe PDF
Somprathana_ra_ch6.pdf	701.18 kB	Adobe PDF
Somprathana_ra_ch7.pdf	458.27 kB	Adobe PDF
Somprathana_ra_back.pdf	1.66 MB	Adobe PDF
